Game Driven Software Development for NPOs: the Scientific Community Game (SCG)

NP Optimization Problems (NPOs) NPOs are approximated using (ensembles of) heuristics. Foster development and innovation of heuristics.

Fostering Heuristics Development Feedback! Analyze the performance of heuristics in a niche to form better ensembles. Parameter tuning. Bug fixes.

Fostering Heuristics Innovation Analyzing the niches within the problem domain. Constructing hard problems. Hints!

Game Driven Software Development A number of autonomous teams. Each team develops an agent that embodies their own heuristics. Agents participate in a contest. Contest winners get an egoistic boost. Teams develop their agents for the next contest.

Why Does It Work? Autonomy: teams/agents are free to innovate and develop their heuristics. Mastery: getting better supports the team’s/agent’s ego. Purpose: accumulate new shared knowledge about a specific NPO.

Game Driven Development Has Worked Before! Renaissance mathematicians. SAT competitions.

The SCG(X) Game

X is a specific, predefined NPO domain, e.g. Boolean MAXCSP. In every round, agents must propose new hypotheses and oppose other agents’ hypotheses. Agents oppose hypotheses by either strengthening or challenging them.

The SCG(X) Game [cont.] Agents gain reputation when they strengthen hypotheses. Agents challenge hypotheses by engaging in a discounting protocol. Agents gain reputation when they discount hypotheses and lose reputation when they fail to do so.

The SCG(X) Game [cont.] All agents start with the same reputation. The sum of all reputations is preserved. Agent(s) with the highest reputation win(s).

We Didn’t Tell You... What do hypotheses look like? What is the discounting protocol? How much reputation do agents gain/lose when they strengthen/discount hypotheses?

All-Conjectures Niche lower bound: all problems in niche N of X can be solved with quality of at least Q. Niche upper bound: there exists a problem in niche N of X that cannot be solved with quality more than Q. For NPO problems, Q ∈ [0,1]. ??

10/16/09 Can DM and ML help? Example: SCG(MAXCSP) Challenge Language 1 (all) Domain MAXCSP: maximize the fraction of satisfied constraints. Challenge: Alice challenges Bob to discount the statement belief(pred, 0.7): There exists a problem p in subset pred so that for all solutions s to p, quality(p,s) < 0.7. Confidence = 1. Discounting protocol: Alice gives Bob p’, Bob solves it with s’: quality(p’,s’) >= 0.7.
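The deliver-solve round described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy domain (unary constraints of the form "variable i should equal b"), the `majority_solver`, and all function names are hypothetical and do not come from the slides; only the acceptance test quality(p’,s’) >= 0.7 is taken from the text.

```python
from collections import Counter
from fractions import Fraction


def quality(problem, solution):
    """Fraction of satisfied constraints; a constraint is (variable, wanted_value)."""
    satisfied = sum(1 for var, wanted in problem if solution[var] == wanted)
    return Fraction(satisfied, len(problem))


def majority_solver(problem):
    """Hypothetical solver for Bob: give each variable its most requested value."""
    votes = {}
    for var, wanted in problem:
        votes.setdefault(var, Counter())[wanted] += 1
    return {var: counts.most_common(1)[0][0] for var, counts in votes.items()}


def run_discounting_protocol(provide_problem, solve, threshold):
    """Alice delivers p', Bob answers with s'; the belief is discounted
    when quality(p', s') >= threshold."""
    problem = provide_problem()
    solution = solve(problem)
    achieved = quality(problem, solution)
    return achieved >= threshold, achieved


# Alice's delivered problem in the toy domain (assumed data).
alice_problem = [(0, 1), (0, 1), (0, 0), (1, 0)]
discounted, achieved = run_discounting_protocol(
    lambda: alice_problem, majority_solver, Fraction(7, 10)
)
print(discounted, achieved)
```

Here Bob satisfies three of the four constraints, so he reaches the 0.7 threshold and the belief counts as discounted.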

All-conjectures [Example]

Opposing Beliefs: all
Belief "for all P exists S: Quality(P,S) >= Q":
  Q is too low: Override. Opponent shares a belief with a higher Q’. (O+)
  Q is too high: Challenge. Opponent provides a problem. Belief holder solves it (S’). (O+-)
Belief "exists P for all S: Quality(P,S) < Q":
  Q is too low: Challenge. Belief holder provides a problem. Opponent solves it (S’). (O+-)
  Q is too high: Override. Opponent shares a belief with a lower Q’. (O+)

Hypothesis Alice’s hypothesis: There exists a problem P in niche N of X such that for all solutions S_Bob searched by the opponent Bob in T seconds, Quality(P, S_Bob) < AR * Quality(P, S_Alice). Hypotheses have an associated confidence in [0,1]. Hypothesis:. SQ = Quality(P, S_Alice)

Example: SCG(MAXCSP) Challenge Language 2 (secret) Karl: 1in3 example.

Hypothesis [Example] 1in3 example.

X = Boolean MAXCSP Given a sequence of Boolean constraints formulated using a set R of Boolean relations, find an assignment that maximizes the fraction of satisfied constraints. It is an NPO for most R. The decision version is NP-complete for most R. The niche is defined by R.

1in3 niche Only relation 1in3 is used.
1in3 problem P over variables v1 v2 v3 v4 v5:
  1in3(v1 v2 v3)
  1in3(v2 v4 v5)
  1in3(v1 v3 v4)
  1in3(v3 v4 v5)
Secret assignment (v1 v2 v3 v4 v5): 1 0 0 1 0
Truth table 1in3:
  000 0
  001 1
  010 1
  011 0
  100 1
  101 0
  110 0
  111 0
Secret quality SQ = 3/4
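As a check on the secret quality, the example problem can be evaluated directly. The sketch below is a minimal illustration; the names `one_in_three`, `quality`, `VARIABLES`, `CONSTRAINTS` and `SECRET` are chosen here and are not part of SCG. The exhaustive search at the end is included only because the instance is tiny.

```python
from fractions import Fraction
from itertools import product

VARIABLES = ["v1", "v2", "v3", "v4", "v5"]
CONSTRAINTS = [
    ("v1", "v2", "v3"),
    ("v2", "v4", "v5"),
    ("v1", "v3", "v4"),
    ("v3", "v4", "v5"),
]
# Alice's secret assignment from the slide: v1..v5 = 1 0 0 1 0.
SECRET = dict(zip(VARIABLES, (1, 0, 0, 1, 0)))


def one_in_three(a, b, c):
    """Relation 1in3: true when exactly one of the three bits is 1."""
    return a + b + c == 1


def quality(constraints, assignment):
    """Fraction of 1in3 constraints satisfied by the assignment."""
    satisfied = sum(
        one_in_three(*(assignment[v] for v in constraint))
        for constraint in constraints
    )
    return Fraction(satisfied, len(constraints))


secret_quality = quality(CONSTRAINTS, SECRET)

# Exhaustive search over all 2**5 assignments (feasible only for toy sizes).
best_quality = max(
    quality(CONSTRAINTS, dict(zip(VARIABLES, bits)))
    for bits in product((0, 1), repeat=len(VARIABLES))
)
print(secret_quality, best_quality)
```

The secret satisfies three of the four constraints, giving SQ = 3/4. On this particular instance the exhaustive search finds an assignment that satisfies all four (v1 = 1, v5 = 1, the rest 0), so the secret need not be optimal; the hypothesis only compares the opponent's quality against SQ.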

1in3 Hypothesis 1in3 hypothesis H proposed by Alice: there exists P in the 1in3 niche so that for all S_Bob that opponent Bob searches in t seconds (t a small constant): Quality(P, S_Bob) < 0.4 * Quality(P, S_Alice). H = (niche = (1in3), AR = 0.4, confidence = 0.8). Bob has clever knowledge that Alice does not have. He opposes the hypothesis H by challenging it using his randomized algorithm.

Bob’s clever knowledge: 4/9 for 1in3 For all P in the 1in3 niche, there exists S so that Quality(P,S) >= 0.444 * SQ. Proof: la(p) = 3*p*(1-p)^2 has the maximum 4/9; argmax over p in [0,1] of la(p) is 1/3. Without search, in PTIME. Derandomize. Bob successfully discounts. Alice gets a hint. Was Bob just lucky? Truth table 1in3: 000 0, 001 1, 010 1, 011 0, 100 1, 101 0, 110 0, 111 0.
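The slide only says "Derandomize" and does not name a technique. One standard option is the method of conditional expectations, sketched below as an assumption rather than as Bob's actual algorithm. It presumes every constraint uses three distinct variables, so that setting each variable to 1 independently with probability 1/3 satisfies a constraint with probability 3 * (1/3) * (2/3)^2 = 4/9. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

P_TRUE = Fraction(1, 3)  # bias that maximizes 3*p*(1-p)**2


def prob_one_in_three(constraint, partial):
    """Probability that exactly one variable of the constraint is 1 when
    unassigned variables are set to 1 independently with probability P_TRUE."""
    total = Fraction(0)
    for bits in product((0, 1), repeat=3):
        if sum(bits) != 1:
            continue
        prob = Fraction(1)
        for var, bit in zip(constraint, bits):
            if var in partial:
                prob *= 1 if partial[var] == bit else 0
            else:
                prob *= P_TRUE if bit == 1 else 1 - P_TRUE
        total += prob
    return total


def expected_satisfied(constraints, partial):
    """Expected number of satisfied constraints given a partial assignment."""
    return sum(prob_one_in_three(c, partial) for c in constraints)


def derandomized_assignment(variables, constraints):
    """Fix variables one by one, never letting the conditional expectation drop."""
    partial = {}
    for var in variables:
        partial[var] = max(
            (0, 1),
            key=lambda bit: expected_satisfied(constraints, {**partial, var: bit}),
        )
    return partial


def quality(constraints, assignment):
    satisfied = sum(1 for c in constraints if sum(assignment[v] for v in c) == 1)
    return Fraction(satisfied, len(constraints))


variables = ["v1", "v2", "v3", "v4", "v5"]
constraints = [("v1", "v2", "v3"), ("v2", "v4", "v5"),
               ("v1", "v3", "v4"), ("v3", "v4", "v5")]
assignment = derandomized_assignment(variables, constraints)
achieved = quality(constraints, assignment)
print(assignment, achieved)
```

Because each step keeps the conditional expectation at or above its previous value, the final assignment satisfies at least 4/9 of the constraints, without search and in polynomial time.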

1in3 Hypothesis Bob does not know whether 4/9 is the best possible. Should check Semidefinite Programming. Bob only knows that deciding whether a 1in3 problem has a solution satisfying a fraction 4/9 + eps of the constraints, eps > 0, is NP-complete.

Opposing Hypotheses
Hypothesis: exists P for all S that opponent searches: Quality(P,S) < AR * SQ
  AR is too low: Challenge. Hypothesis proposer provides a problem. Opponent solves it.
  AR is too high: Strengthen. Opponent proposes a hypothesis with a lower AR.

Questioning Beliefs: secret
Belief "for all P exists S: Quality(P,S) < Q":
  Q is too low: Discount. Share a belief with a higher Q’.
  Q is too high: Challenge. Ask belief holder to provide a problem, then solve it (S’).
Belief "exists P for all S: Quality(P,S) >= Q":
  Q is too low: Challenge. Provide a problem, then ask challenger to solve it (S’).
  Q is too high: Discount. Share a belief with a lower Q’.

All vs. Secret Conjectures
All-conjectures: absolute certainty (confidence 1); impossibility; statement about all possible assignments.
Secret-conjectures: uncertainty (confidence < 1); small chance of success; statement about assignments that one specific algorithm searches in a given time.

Properties of challenge language Discounting and supporting require constructive skills. Uncertainty about which problem will be delivered. Optional: mathematical skills. –When agents are perfect, supporting implies the statement is a theorem and discounting implies the statement is NOT a theorem (a counterexample was found).

Reputation Gain Hypotheses have a credibility in [0, ∞]. The credibility of a hypothesis is proportional to the agent’s confidence in the hypothesis and the agent’s reputation. Reputation gain is proportional to the discounting factor and the hypothesis credibility. The discounting factor is in [-1,1]; 1 means the hypothesis is completely discounted.

Discounting Factor
Hypothesis: exists P for all S that opponent searches: Quality(P,S) < AR * SQ
  AR is too low (challenge): Quality(P,S’) - AR * SQ
  AR is too high (strengthen): AR - AR’

H1 = ((1in3), AR = 1.0, confidence = 1.0). H1 proposed by Alice: exists P in 1in3 niche so that for all S that opponent Bob searches: Quality(P,S) < 1.0 * SQ. This is a reasonable hypothesis if Alice is sure that her secret assignment is the maximum assignment when she provides a sufficiently big problem to Bob.

What we did not tell you so far A game defines some configuration constants. A maximum problem size: for example, all problems in the niche can have at most 1 million constraints. A maximum time bound for all tasks (propose, oppose, provide, solve), e.g. 60 seconds. An initial reputation, e.g. 100. When an agent’s reputation becomes negative, the agent has lost.
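These constants could be grouped into a small configuration object. The sketch below is hypothetical: the class name, field names and helper methods are assumptions made for illustration, and only the example values (1 million constraints, 60 seconds, initial reputation 100, loss on negative reputation) come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GameConfig:
    """Configuration constants of one SCG(X) game (illustrative names)."""
    max_constraints: int = 1_000_000    # maximum problem size in the niche
    time_limit_seconds: float = 60.0    # bound for propose, oppose, provide, solve
    initial_reputation: float = 100.0   # every agent starts with this value

    def admits(self, num_constraints: int) -> bool:
        """A delivered problem is legal only if it respects the size bound."""
        return 0 < num_constraints <= self.max_constraints

    def has_lost(self, reputation: float) -> bool:
        """An agent has lost once its reputation becomes negative."""
        return reputation < 0


config = GameConfig()
print(config.admits(4), config.has_lost(-0.5))
```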

Discounting Factor: ReputationGain for Strengthening H1 = ((1in3), AR = 1.0, confidence = 1.0). H1 proposed by Alice: exists P in 1in3 niche so that for all S that opponent Bob searches: Quality(P,S) < 1.0 * SQ. Bob thinks he can strengthen H1 to H2 = (MAXCSP, niche = secret ExistsForAll (1in3), AR = 0.9, confidence = 1.0). DiscountingFactor = 1.0 - 0.9 = 0.1. ReputationGain for Bob = 0.1 * 1.0 * AliceReputation. Alice gets her reputation back if she discounts H2.

Discounting Factor: ReputationGain for Discounting H = ((1in3), AR = 0.4, confidence = 1.0). H proposed by Alice: exists P in 1in3 niche so that for all S that opponent Bob searches: Quality(P,S) < 0.4 * SQ. Bob knows he can discount H based on this knowledge: 4/9 for 1in3. Let’s assume he achieves 0.45 on Alice’s problem. DiscountingFactor = 0.45 - 0.4 = 0.05. ReputationGain for Bob = 0.05 * 1.0 * AliceReputation.

Discounting Factor: ReputationGain for Supporting H = ((1in3), AR = 0.4, confidence = 1.0). H proposed by Alice: exists P in 1in3 niche so that for all S that opponent Bob searches: Quality(P,S) < 0.4 * SQ. Bob believes he can discount H based on this knowledge: 4/9 for 1in3. Let’s assume he achieves only 0.3 on Alice’s problem. Bob has a bug somewhere! DiscountingFactor = 0.3 - 0.4 = -0.1. ReputationGain for Bob = -0.1 * 1.0 * AliceReputation, i.e. Bob loses reputation.
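The three worked examples follow one pattern: gain = discounting factor * confidence * proposer's reputation. A minimal sketch of that arithmetic is given below. Several points are assumptions: the function names are invented, the achieved quality is treated as relative to SQ (so `sq` defaults to 1.0, matching how the slides subtract 0.4 from 0.45), both agents start at reputation 100, and the opposer's gain is taken directly from the proposer, which is one simple way to keep the sum of reputations constant.

```python
import math


def challenge_factor(achieved_quality, ar, sq=1.0):
    """Discounting factor for a challenge: Quality(P, S') - AR * SQ."""
    return achieved_quality - ar * sq


def strengthen_factor(ar, stronger_ar):
    """Discounting factor for strengthening: AR - AR'."""
    return ar - stronger_ar


def settle(factor, confidence, proposer_reputation, opposer_reputation):
    """Move factor * confidence * proposer_reputation from proposer to opposer.
    A negative factor moves reputation the other way (the opposer failed)."""
    transfer = factor * confidence * proposer_reputation
    return proposer_reputation - transfer, opposer_reputation + transfer


# Bob strengthens H1 from AR = 1.0 to AR = 0.9.
after_strengthen = settle(strengthen_factor(1.0, 0.9), 1.0, 100.0, 100.0)
# Bob discounts H (AR = 0.4) by achieving 0.45.
after_discount = settle(challenge_factor(0.45, 0.4), 1.0, 100.0, 100.0)
# Bob fails against H (AR = 0.4), achieving only 0.3.
after_failure = settle(challenge_factor(0.3, 0.4), 1.0, 100.0, 100.0)

for alice, bob in (after_strengthen, after_discount, after_failure):
    print(round(alice, 6), round(bob, 6), math.isclose(alice + bob, 200.0))
```

Floating-point subtraction makes the factors slightly inexact (1.0 - 0.9 is not exactly 0.1), which is why the comparison uses `math.isclose`.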

Mechanism Design The exact SCG(X) mechanism is still a work in progress. SCG(X) mechanism must be sound: Encourage productive behavior and discourage unproductive behavior of scientists. The agent with best heuristics wins.

Tools to facilitate use of SCG(X) Definition of X. Generate a client-server infrastructure for playing SCG(X) on the web. Administrator enforces SCG(X) rules: client. Baby agents: servers. They can communicate and play an uninteresting game. Baby agents get improved by their caregivers, register with Administrator and the game begins at midnight.

Properties

SCIENTIFIC COMMUNITY

SCG: a scientific market game Domain X (a problem-solving domain such as an NPO domain). Agents with a reputation: offer-accept-deliver-solve. Agents offer challenges with a confidence. Agents accept challenges. Discounting protocol for challenges: deliver-solve. An agent wins reputation
–when it accepts and discounts a challenge of another agent (challenge confidence * offerer reputation * at-risk).
–when it supports its own challenge that was accepted by an agent (challenge confidence * acceptor reputation * at-risk).

Think of a scientific community about domain X Scientists have reputations. Scientists offer statements with a confidence. Scientists question statements (accept). Scientists use the discounting protocol (deliver-solve). Scientists win and lose reputation.

Scientific Market SCG(X) Defined by
–Generic SCG game: axioms; a mechanism (game rules) satisfying the axioms.
–Description of NPO X: feasible solutions, objective function.
–Belief language for X: predicates for defining niches (subsets of problems in X); belief predicates.
Purpose of game: Good scientific behavior in domain X is rewarded.

Scientists and virtual scientists! They are encouraged to: offer results that are not easily improved; offer results that they can successfully support; quote related work and show how they improve on previous work; prove results if the current state of the art allows; publish results of an experimental nature with an appropriate confidence level; stay active and publish new beliefs or oppose current beliefs; be well-rounded: solve posed problems and pose difficult problems for others (like the Four Color Conjecture); become famous!

Productive Scientific Behavior (1) The agents propose hypotheses that are difficult to strengthen or challenge (i.e. non-trivial yet correct). Otherwise, they lose reputation to their opponents. Offer results that cannot be easily improved. Offer results that they can successfully support.

Good scientific behavior (2) Opposing a belief comes in two flavors. The agents should share “tight” beliefs. Agents who share a belief that is not tight lose reputation and the agents who tighten a belief win reputation unless the tightened belief is discounted by some other agent. offer results that are not easily improved. quote related work and show how they improve on previous work. The agents should share beliefs that are difficult to discount. Agents who share beliefs that are discountable lose reputation and the challengers who successfully discount win reputation. offer results that they can successfully support.

Productive Scientific Behavior (2) Agents are encouraged to propose hypotheses they are not sure about. But they need to fairly express their confidence in their hypotheses. If the confidence is inappropriately high, they lose too much reputation if the hypothesis is successfully discounted. If the confidence is inappropriately low, they don’t win enough reputation if the hypothesis is successfully supported. publish results of an experimental nature with an appropriate confidence level.

Productive Scientific Behavior (3) Agents stay active. In each “round”, they must propose new hypotheses and oppose other agents’ hypotheses. Stay active and publish new hypotheses or oppose current hypotheses. Agents maximize their reputation. Become famous!

Productive Scientific Behavior (4) When Alice loses reputation to Bob, Alice can learn from Bob: Alice has a bug in her software. Bob has skills superior to hers. Alice should try to acquire Bob’s skills. Learn from mistakes. Be careful how you oppose a Nobel Laureate. The risks are high.

Unproductive Scientific Behavior Cheating is forbidden: you can only succeed through good scientific behavior (by adding useful hypotheses or by successfully opposing hypotheses in the knowledge base).

Fair Scientific Community All agents start with the same initial reputation. The winner has the best skills in domain X within the set of participating agents.

Properties Agents are penalized for unproductive behaviors. A behavior is unproductive if it cannot lead to the accumulation of new knowledge about the specific NPO. Equilibrium. Agent with the best heuristics wins the game. Two player games + tournament.

Applications

Improving the research approach Problem to be solved: Develop the best practical algorithms for solving NPO X. Standard solution: Write hundreds of papers on the topic with isolated implementations. What are the best practical algorithms? Our solution: Use the virtual scientific agent community SCG(X) with a suitably designed hypotheses language to compare the algorithms. The winning agent has the best practical algorithms.

Game works at the press of a button to determine the winner. The winner has the best skills in the chosen domain. Find the experientially best algorithms for solving problems in domain X. Evaluation tool. The feedback is constructive. Testing and Learning Tool. Grading Tool. Over time, the market will collect undiscounted challenges: Belief Maintenance System. Agents must be reliable: Teaching Software Engineering Tool. Grading Tool.

What is a scientific virtual market game good for? Market works at the press of a button to determine the winner. –The winner has the best skills in the chosen domain. Evaluation tool. –The feedback is constructive. Testing and Learning Tool. Grading Tool. –Over time, the market will collect undiscounted challenges: Belief Maintenance System. –Agents must be reliable: Teaching Software Engineering Tool. Grading Tool.

Contributing to State of the art knowledge of domain X

Applications of SCG(X) Find the experientially best algorithms for solving problems in X.

Teaching

Agent World for SCG(X) Agent caregiver lives outside the SCG(X) world: world-class experts in domain X; graduate and undergraduate students studying domain X or studying material needed to solve problems in X; learning algorithms based on game histories. Agent lives inside the SCG(X) world: agents win and lose reputation. Agent caregiver prepares the agent for the next game.

Teaching: Survival Skills in SCG(X) Software development skills: needed when the agent caregiver is human. Knowledge about domain X needs to be developed by students, or taught to them, understood, and put into the algorithms (propose, oppose (strengthen or challenge), provide, solve) that go into the agent. This tests both whether the knowledge about X is understood and the programming skills.

Teaching: Survival Skills in SCG(X) [cont.] [Scientific Innovation in X] Agents get skills programmed into them by clever scientists in domain X. Scientists use data mining to learn from competitions and manually improve the agents. [Machine Learning Innovation in X] Agents get skills programmed into them by an agent caregiver programmed with learning skills and data mining skills for domain X. Agents get updated automatically between competitions, so they improve automatically.

Software Development Skills Needed when the agent caregiver is human. Knowledge about domain X needs to be developed by students, or taught to them, understood, and put into the algorithms (offer-accept-deliver-solve) that go into the agent. This tests both whether the knowledge about X is understood and the programming skills.

Athena Lightning Sweet Stepdad Peon
Athena 3030
Lightning 0
Sweet 0
Stepdad 3
Peon 1

Skills needed to survive in SCG(X) [Scientific Innovation in X] Agents get skills programmed into them by clever scientists in domain X. Scientists use data mining to learn from competitions and manually improve the agents. [Machine Learning Innovation in X] Agents get skills programmed into them by an agent caregiver programmed with learning skills and data mining skills for domain X. Agents get updated automatically between competitions, so they improve automatically.

Possible Application Domain For DM/ML/AI

SCG(X) produces history
Proposer’s reputation: 120
Hypothesis10 proposer1 opposer2 confidence 1
Problem delivered
Solution found: discountFactor = 1
Opposer: increase in reputation: 1 * 1 * 120 = 120

Blame assignment Where is the proposer to blame? Bad hypothesis that is discountable. Bug in problem finding algorithm. Bug in problem solving algorithm used to check proposed hypothesis.

Creating Agents An agent is composed of 6 components: Agent =. Components can refer to each other. Given a set of agents: Agent 1... Agent n Composed agent is a 12-tuple:. Propose, Oppose, Strengthen, Challenge, Provide, Solve 1=own 0=other

Creating Agents [cont.] PropI, OppI, StrI, ChaI, ProvI, SolI ∈ [1..n]. PropO consists of 5 bits, each denoting one of the other components. The first bit describes whether to use the opposition component of agent PropI or agent OppI.

IMPLEMENTATION

Tools to facilitate use of SCG(X) Definition of X. Generate a client-server infrastructure for playing SCG(X) on the web. Administrator enforces SCG(X) rules: client. Baby agents: servers. They can communicate and play an uninteresting game. Baby agents get improved by their caregivers, register with Administrator and the game begins at midnight.

Conclusions We have shown how a virtual scientific community of agents can foster the development and innovation of heuristics for approximating NPOs. We need your input on how DM and ML could help with evolving the agents.

Questions?

Pending When belief is discounted: offer complement of belief. Belief holder = agent that successfully discounted.

Discounting If Alice offers the belief (FourColorConjecture, confidence = 1.0), she must be ready to support it. –The opponent Bob gives Alice a planar graph. –Alice must deliver a 4-coloring. If she does not, Bob has successfully discounted Alice’s belief: Alice loses reputation and Bob gains. If she does, Alice has successfully defended her belief: Alice wins reputation and the opponent Bob loses. –Note that discounting is different from finding a counterexample. If Alice loses she has a “fault” in her coloring algorithm.

Beliefs: Four color conjecture FourColorConjecture: For all graphs g satisfying the predicate planar(g) there exists a 4-coloring of the nodes of g such that no two adjacent nodes have the same color. ForAllExists belief: For all problems p satisfying predicate pred(p) there exists a solution s satisfying a property(p,s).

Undiscounted beliefs represent the accumulated shared knowledge gained from the game. (Requires negation and reoffer of discounted beliefs?)

