
1 LIAM 2: A NEW OPEN SOURCE DEVELOPMENT TOOL FOR DISCRETE-TIME DYNAMIC MICROSIMULATION MODELS
GIJS DEKKERS (FEDERAL PLANNING BUREAU, Brussels, CESO, K.U.LEUVEN AND CEPS/INSTEAD, Luxembourg)
PHILIPPE LIÉGEOIS (CEPS/INSTEAD and DULBEA, ULB, Brussels)
IMA 4th General Conference, Canberra, Dec 9th, 2013

2 GENERAL MOTIVATION OF THE TRAINING SESSION
• Introduce LIAM2, a free, open source, user-friendly modelling and simulation framework
• Basic functionalities, through examples and practice (“learning-by-doing”): once back home, you should be ready to use LIAM2 and to start developing your own microsimulation (MSM) model
• More advanced topics and latest developments, through examples: making you aware of (new) possibilities and technical difficulties
• Documentation: slides (gathering only the essential information by topic), some examples and the “UserGuide” (release 0.7.0) included in the LIAM2 bundle (http://liam2.plan.be) + Google group
Gijs Dekkers and Philippe Liégeois - TRAINING LIAM2 - IMA Conference 2013 - Canberra

3 WHY LIAM2 AND WHERE DOES IT COME FROM?
• Most existing microsimulation models have been developed by separate (teams of) researchers.
• The drawback of each team working on its own is that each has to put a lot of time and effort into developing its own simulation tools, which makes microsimulation models more expensive than strictly necessary.
• Furthermore, as modellers are often not professional programmers, the result is not necessarily the most efficient in terms of simulation speed.
• This is why several partners joined forces to develop a dynamic microsimulation modelling toolbox (“LIAM2”).

4 WHY LIAM2 AND WHERE DOES IT COME FROM?
• Initiated through a collaboration between the Federal Planning Bureau in Brussels (development), CEPS/INSTEAD and the General Inspectorate of Social Security in Luxembourg (testing and complementary funding) and Cathal O’Donoghue (LIAM and expertise), as well as other experts, under European funding (MiDaL Project 2009-2011, PROGRESS programme, Grant VS/2009/0569, CEPS/INSTEAD)
• Most of the technical work on LIAM2 is done in Brussels (Gaëtan de Menten, Geert Bryon, Raphaël Desmet and Gijs Dekkers)
• Open source, user-friendly and efficient:
    • A clear separation between “modellers” (responsible for the modelling) and “programmers” (in charge of critical methodological issues, including state-of-the-art methods for data handling and simulation optimization)
    • A modelling language that is easy for modellers to use

5 CONTENTS OF THE TRAINING SESSION
1) Getting started with LIAM2
2) A rudimentary model, creating objects and some output
3) Linking objects
4) Stochastic simulation
5) Marriage market (matching function)
6) Importing data into the LIAM2 input format
7) Advanced topics
8) Conclusions, including building a model with LIAM2

6 1. GETTING STARTED WITH LIAM2
• Just download and use it (demonstration)
Start – Model – Links – Stochastic – Matching – Import – Others

7 2. A RUDIMENTARY MODEL - 2.1 THE BRICKS
• LIAM2 involves several kinds of « bricks », among which:
    • ENTITIES: objects (persons, households, firms, cells, …) with a unique identifier
    • FIELDS: attributes of an entity (e.g. a person’s age)
    • LINKS: relations between entities (e.g. a person’s children); links can be chained (e.g. spouse.mother.age)
    • GLOBALS: parameters not related to a specific entity, which may vary through time (e.g. CPI)
    • PROCESSES: assignments, which change the value of a variable using an expression (e.g. « age + 1 »), and actions, which do not (e.g. removing dead persons)
    • MACROS: pieces of code, re-evaluated each time they are referenced (e.g. « WIDOW: civilstate == 5 »)

8 2.2 OVERALL STRUCTURE OF A MODEL
• A model is typically composed of 3 main blocks:
    • « globals »
    • « entities »: including their fields, links, the definition of processes (order not meaningful) and the macros available for that kind of entity
    • « simulation »: including the general setup of the model, e.g. input, output, starting period of the simulation, number of periods to simulate, etc.
• Within the blocks, indentation is meaningful (cf. the YAML markup language)

9 LXG MIDAS MODEL (2012): a global view first…

    globals:
        periodic:
            - CPI: float                # Consumer Price Index
            - ...

    entities:
        household:
            ...
        person:
            fields:
                - age: int
                - gender: bool          # “0” if female, “1” if male
                - m_id: int             # mother’s identifier
                - ...
            macros:
                FEMALE: gender == 0
                ...
            links:
                # first one: Mother to Children
                mc: {type: one2many, target: person, field: m_id}
                ...
            processes:
                age: "age + 1"
                ...
                divorce: "..."          # the process is here DECLARED/SPECIFIED only
                ...

    simulation:
        processes:
            - person: [
                year,
                age,
                ...
                divorce,                # the process is now SIMULATED/used
                ...
              ]
        input:
            path: "INPUT_DATA"
            file: "INIT_MODEL_LXG_INPUT_2007.h5"    # “HDF5” format
        output:
            path: "OUTPUT_DATA"
            file: "FINAL_MIDAS_LXG_RESULTS.h5"      # “HDF5” format
        start_period: 2008              # first simulated period
        periods: 20

10 MIDAS_LU - 1st part: declaring objects…

    #
    # LUXEMBOURG MIDAS MODEL
    # FINAL VERSION, AS ON 14 MAR 2012
    #
    globals:
        periodic:
            - CPI: float                # Consumer Price Index
            - ...

    entities:
        household:
            ...
        person:
            fields:
                - age: int
                - gender: bool          # “0” if female, “1” if male
                - m_id: int             # mother’s identifier
                - ...
            macros:
                FEMALE: gender == 0
                ...
            links:
                # first one: Mother to Children
                mc: {type: one2many, target: person, field: m_id}
                ...
            processes:
                age: "age + 1"
                ...
                divorce: "..."          # the process is here DECLARED only
                ...

11 … then 2nd part: simulating…

    simulation:
        processes:
            - person: [
                year,
                age,
                ...
                divorce,                # the process is now SIMULATED/used
                ...
              ]
        input:
            path: "INPUT_DATA"
            file: "INIT_MODEL_LXG_INPUT_2007.h5"    # “HDF5” format
        output:
            path: "OUTPUT_DATA"
            file: "FINAL_MIDAS_LXG_RESULTS_YEARS_2007_2009.h5"
        start_period: 2008              # first simulated period
        periods: 2

12 2.3 INPUT DATA AND STRUCTURE
• Demonstration
D1

13 2.4 OUTPUTTING OUTCOMES (A)
• dump(), groupby(), show(), qshow(), csv(), …
• dump() ~ « list »
Syntax:
    dump([expr1, expr2, ..., filter=filterexpression, missing=value, header=True])
Example: run the model (e.g. from the bundled Notepad++ editor => F6) with
    dump(id, hh_id, age, gender, filter=id < 20)

14 2.4 OUTPUTTING OUTCOMES (B)
• groupby() ~ summary tables
Syntax:
    groupby(expr1[, expr2, expr3, ...]
            [, expr=expression]             # « count() » by default
            [, filter=filterexpression]
            [, percent=True], …)
Example:
    groupby(trunc(age / 10), gender)
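
To make the groupby() example more concrete, here is a minimal Python sketch of what a count by age decade and gender amounts to. It is only an illustration: the toy population and the helper name groupby_count are hypothetical, and this is not how LIAM2 implements groupby().

```python
from collections import Counter

# Hypothetical toy population: (age, gender) pairs, not taken from the LIAM2 bundle.
persons = [(5, True), (17, False), (23, True), (28, True), (41, False), (47, False)]

def groupby_count(rows):
    """Count rows by (age decade, gender), mimicking groupby(trunc(age / 10), gender)."""
    # For non-negative integer ages, age // 10 equals trunc(age / 10).
    return Counter((age // 10, gender) for age, gender in rows)

table = groupby_count(persons)
for key in sorted(table):
    print(key, table[key])
```

Each key of the resulting table is one cell of the summary table; the default « count() » corresponds to the counter values.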

15 2.4 OUTPUTTING OUTCOMES (C)
• show(), qshow()
Syntax:
    show(expr1[, expr2, expr3, ...])
Example:
    show(count(age >= 18), avg(age, filter=age >= 18))
NB: “show” is implicit in the console environment itself
NB2: qshow() is equivalent to show(), but also prints the textual form of each expression in the output

16 2.4 OUTPUTTING OUTCOMES (D)
• csv() ~ writes to « .csv » files
Syntax:
    csv(expr1[, expr2, expr3, ...][, suffix='file_suffix'][, fname='filename'][, mode='w'])
    (mode is 'w' by default; use 'a' to append)
NB: the default « fname » is « {entity}_{period} » (e.g. « person_2003.csv »)
Example:
    csv(avg(income), suffix='income')       (e.g. « person_2003_income.csv »)

17 2.5 THE "INIT" PHASE OF SIMULATION

18 NB: ABOUT VARIABLES IN LIAM2 - LOCAL/TEMPORARY
• Most often, variables are declared in the « fields » section of an entity: they may be assigned a value through processes and will be stored in the output file
• But often you need a variable only to store an intermediate result: simply assign a value to an undeclared variable
• Example:
    person:
        fields:
            # period and id are implicit
            - age: int
            - agegroup: int
        processes:
            age: age + 1
            agediv10: trunc(age / 10)       # undeclared, hence temporary
            agegroup: agediv10 * 10
            agegroup2: agediv10 * 5         # undeclared, hence temporary

19 3. LINKING OBJECTS - (A) MANY2ONE
• Links are of « many2one » or « one2many » type
• « many2one »: links many objects to a single one (e.g. children to their mother, or household members to their household)
Syntax:
    link_name: {type: many2one, target: <target entity>, field: <link field>}
NB: “field” logically refers to the “single” side of the link
Example (defined within entity « person »):
    household: {type: many2one, target: household, field: hh_id}
D2

20 3. LINKING OBJECTS - (B) ONE2MANY
• « one2many »: links one object to several others (e.g. a household to its members)
Syntax (same as many2one):
    link_name: {type: one2many, target: <target entity>, field: <link field>}
NB: “field” logically refers to the “single” side of the link
Example (defined within entity « household »):
    persons: {type: one2many, target: person, field: hh_id}

21 3. LINKING OBJECTS - (C) GETTING INFO OUT OF LINKAGE
• To access information through links, just give the route and « chain » the steps towards the target:
Syntax (basic):
    link_name.field_name
Examples:
    mother.age
    mother.mother.age
Aggregate information in a one2many context (in « household »):
    persons.avg(age)
    persons.count(age <= 17)
Aggregate information in a many2one context (in « person »):
    household.get(persons.avg(age))
    mother.get((mother.age + father.age) / 2)
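
The link expressions above can be pictured with plain Python data structures. The sketch below is an analogy only: the toy rows and the helper names (members, hh_avg_age, person_hh_avg_age) are hypothetical and do not reflect how LIAM2 resolves links internally.

```python
from statistics import mean

# Hypothetical toy data: each person row carries hh_id, the field used by the link.
persons = [
    {"id": 1, "hh_id": 10, "age": 40},
    {"id": 2, "hh_id": 10, "age": 12},
    {"id": 3, "hh_id": 11, "age": 70},
]

def members(hh_id):
    """one2many direction: all persons whose hh_id points to this household."""
    return [p for p in persons if p["hh_id"] == hh_id]

def hh_avg_age(hh_id):
    """Analogue of persons.avg(age), evaluated in a household."""
    return mean(p["age"] for p in members(hh_id))

def person_hh_avg_age(person):
    """Analogue of household.get(persons.avg(age)), evaluated in a person."""
    return hh_avg_age(person["hh_id"])

for p in persons:
    print(p["id"], person_hh_avg_age(p))
```

The many2one direction is a simple lookup through hh_id, while the one2many direction collects several rows and therefore needs an aggregate such as avg or count.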

22 4. INTRODUCING STOCHASTIC SIMULATION - 4.1 THE “CHOICE” FUNCTION (A)
• Up to now, the approach has been deterministic: no random component
• Imagine, on the contrary, an event happening with exogenous probability p: the gender of a newborn, but also who is dying, marrying, getting a job, etc.
• The simplest way to « decide », while simulating, the outcome of the random event for a given entity is to draw a random number u from a uniform [0, 1] distribution. If u < p, then the entity experiences the event (e.g. dying), so that the event happens with probability p.
• The same methodology can be applied to a choice between more than 2 options (e.g. assigning a household to a region within the country)
D2 & 3
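
As a minimal sketch of this decision rule, assuming the event is triggered when u < p (so that it occurs with probability p), the following Python code simulates many draws and checks the observed share. The function name event_happens and the seed are arbitrary choices made for the illustration.

```python
import random

def event_happens(p, rng):
    """Return True with probability p by comparing a uniform draw with p."""
    u = rng.random()          # uniform draw in [0, 1)
    return u < p

rng = random.Random(12345)    # fixed seed, chosen arbitrarily for reproducibility
n = 100_000
share = sum(event_happens(0.2, rng) for _ in range(n)) / n
print(share)                  # should be close to 0.2
```

With a large number of entities, the share experiencing the event approaches p, which is what makes the rule usable for events such as death or marriage.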

23 4.1 THE “CHOICE” FUNCTION (B)
• This is exactly what the “choice” function in LIAM2 is designed to do.
• Suppose there are i = 1..n choice options, each with a probability p_i. A choice expression then has the following form:
Syntax:
    choice([option_1, option_2, ..., option_n],
           [prob_option_1, prob_option_2, ..., prob_option_n])
Examples:
    gender_just_born: choice([True, False], [0.51, 0.49])
    education_level: choice([1, 2, 3], [0.1, 0.4, 0.5])
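
One common way to realise such a choice is to locate a single uniform draw within the cumulative probabilities. The sketch below illustrates that idea in Python; choice_sketch is a hypothetical name, and nothing here claims to reproduce LIAM2's actual implementation of choice().

```python
import random
from itertools import accumulate

def choice_sketch(options, probabilities, rng):
    """Pick one option by locating a uniform draw in the cumulative probabilities."""
    u = rng.random()
    for option, cumulative in zip(options, accumulate(probabilities)):
        if u < cumulative:
            return option
    return options[-1]        # guards against rounding when probabilities sum to ~1

rng = random.Random(2013)     # arbitrary seed
draws = [choice_sketch([1, 2, 3], [0.1, 0.4, 0.5], rng) for _ in range(100_000)]
shares = {level: draws.count(level) / len(draws) for level in (1, 2, 3)}
print(shares)                 # shares should be close to 0.1, 0.4 and 0.5
```

With two options this reduces to the single comparison of u with p described for a binary event.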

24 4.2 LOGIT REGRESSION (A)
• Sometimes we know more about the event: its probability may depend on the entity’s (personal) characteristics, which can be made explicit through, e.g., a « logit regression » in the case of a qualitative (dichotomous) choice. E.g. a married couple may be less at risk of divorce the longer the marriage has lasted up to the present period.
• Formally, if « α + βX » is the combination of characteristics x_i resulting from a logit regression that correctly « explains » the event under study, then:
    • First, a « logit score » (which is a probability) can be computed:
          logit_score(α + βX) = logistic(α + βX - logit(u))
      where « u » is a random number drawn from a uniform [0, 1] distribution, logit(u) = log(u / (1 - u)) and logistic(z) = logit^-1(z) = 1 / (1 + exp(-z))
    • Second, the decision rule can be the following: if the logit score > 0.5, the event happens (e.g. dying), otherwise not.

25 4.2 “LOGIT_SCORE” AND “LOGIT_REGR” FUNCTIONS (B)
• In LIAM2, the first task, computing a score or risk, is performed by the logit_score(expression) function, which returns a probability p = logistic(expression - logit(u))
NB: logit_score(0.0) is equivalent to uniform(), so it returns a value > 0.5 with probability 1/2 (and with probability < 1/2 if the expression is < 0.0)
• Another function, logit_regr(), performs both tasks at once and returns a boolean (True if the event happens)
Syntax:
    logit_regr(expression[, filter=conditions], …)
Example:
    death: logit_regr(-0.5 + 0.02 * age, filter=age > 40)
NB: logit_regr(0.0) returns True with probability 0.5; logit_regr(<0) returns True with probability < 0.5
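
The two steps can be written out directly from the formulas on the slides. The Python sketch below reuses the names logit_score and logit_regr for readability only; it is an illustration of the formulas under the stated definitions, not LIAM2's code, and the helper uniform_open is a hypothetical addition that avoids u = 0, where logit(u) is undefined.

```python
import math
import random

def logit(u):
    return math.log(u / (1.0 - u))

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def uniform_open(rng):
    """Uniform draw in the open interval (0, 1), so that logit(u) is defined."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u

def logit_score(x, u):
    """Step 1: score = logistic(x - logit(u)), a number between 0 and 1."""
    return logistic(x - logit(u))

def logit_regr(x, u):
    """Step 2: the event happens when the score exceeds 0.5."""
    return logit_score(x, u) > 0.5

rng = random.Random(7)        # arbitrary seed
x = -0.5 + 0.02 * 60          # the death example evaluated at age 60
n = 100_000
share = sum(logit_regr(x, uniform_open(rng)) for _ in range(n)) / n
print(share, logistic(x))     # the two values should be close
```

Since logistic is increasing, « score > 0.5 » holds exactly when x > logit(u), i.e. when u < logistic(x). The event therefore happens with probability logistic(x), the usual logit probability.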

26 4.3 LOGIT AND STATE ALIGNMENT
• Alternatively, rather than basing the decision on the 0.5 threshold for the probability p, we can decide to « select » a given proportion of entities (by category) which must experience the event: this is « alignment »
• The logit_regr syntax encompasses alignment possibilities:
Syntax:
    logit_regr(expression[, filter=conditions][, align=proportions])
Example:
    divorce: logit_regr(0.6713593 * household.nb_children
                        - 0.0785202 * dur_in_couple
                        + 0.1429621 * agediff,
                        filter=ISFEMALE and ISMARRIED,
                        align='al_p_divorce.csv')
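
The idea behind alignment can be sketched as « rank by score, then keep the required share ». The Python code below is a deliberately simplified illustration with a single category and a single target proportion; the function name align_sketch, its arguments and the use of round() are assumptions of this sketch, whereas LIAM2 reads proportions by category from a file such as 'al_p_divorce.csv'.

```python
def align_sketch(scores, proportion, eligible=None):
    """Select the given proportion of eligible individuals, highest scores first.

    Returns one boolean per individual (True = experiences the event).
    """
    n = len(scores)
    if eligible is None:
        eligible = [True] * n
    candidates = [i for i in range(n) if eligible[i]]
    n_selected = round(proportion * len(candidates))   # rounding rule is an assumption
    ranked = sorted(candidates, key=lambda i: scores[i], reverse=True)
    chosen = set(ranked[:n_selected])
    return [i in chosen for i in range(n)]

scores = [0.9, 0.1, 0.6, 0.4, 0.8]
print(align_sketch(scores, 0.4))
print(align_sketch(scores, 0.5, eligible=[False, True, True, True, True]))
```

Unlike the 0.5 threshold, this guarantees that the simulated number of events matches the external target, while the scores still decide who is selected.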

27 NB: (STOCHASTIC SIMULATION) LOGIT EXTENSIONS
• A simple use of the logit_regr() function is equivalent to a « choice » process: with a constant score of 0.0, individuals are selected purely at random, in the proportions given by the alignment file
Example:
    dead: if(ISMALE,
             logit_regr(0.0, align='al_p_dead_m.csv'),
             logit_regr(0.0, align='al_p_dead_f.csv'))
• In LIAM2, the logit_regr() function can be split into 2 steps, logit_score() and align(), which may make the whole process more flexible (e.g. « take »):
Syntax:
    align(score, proportions[, filter=conditions][, take=conditions][, leave=conditions] …)

28 4.4 OTHER REGRESSIONS
• Continuous

29 5. THE “MARRIAGE MARKET” (“MATCHING” FUNCTION)
• The marriage market matches individuals from set 1 with individuals from set 2.
• For each individual i in set 1, following a particular order (defined through an “orderby” parameter):
    • a score is computed for all (unmatched) individuals in set 2, and
    • the best scoring member of set 2 is chosen for the match with i
Syntax:
    matching(set1filter=boolean_expr,
             set2filter=boolean_expr,
             orderby=<ordering expression>,
             score=coef1 * field1 + coef2 * other.field2 + ...)
D5
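
The procedure described above is a greedy matching, which can be sketched in a few lines of Python. Everything in the sketch is hypothetical (the toy data, the score based on age difference, the function name matching_sketch), and processing set 1 in decreasing order of the orderby value is an assumption of this illustration rather than a statement about LIAM2.

```python
def matching_sketch(set1, set2, orderby, score):
    """Greedy matching: returns a dict {id from set 1: id of its match in set 2}."""
    remaining = list(set2)                 # individuals in set 2 not yet matched
    pairs = {}
    for a in sorted(set1, key=orderby, reverse=True):
        if not remaining:
            break                          # set 2 is exhausted
        best = max(remaining, key=lambda b: score(a, b))
        pairs[a["id"]] = best["id"]
        remaining.remove(best)
    return pairs

# Hypothetical candidates for the marriage market.
women = [{"id": 1, "age": 30}, {"id": 2, "age": 50}]
men = [{"id": 3, "age": 32}, {"id": 4, "age": 49}, {"id": 5, "age": 20}]

pairs = matching_sketch(
    women, men,
    orderby=lambda a: a["age"],                       # oldest first (illustrative)
    score=lambda a, b: -abs(a["age"] - b["age"]),     # smaller age gap = better
)
print(pairs)
```

Because each choice is made in turn and never revised, the order in which set 1 is processed influences the result, which is why an “orderby” parameter is needed.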

30 6. IMPORTING DATA TOWARDS HDF5 FORMAT
• Demonstration

31 7. ADVANCED TOPICS
• Arrays
• new & clone
• align abs
• Tips & tricks
• Common mistakes

32 A NOTE ABOUT THE “NEW”, “REMOVE” AND “CLONE” FUNCTIONS
• Entities (persons, households) may need to be created from scratch or removed during the simulation
• For example:
    • in case of a marriage, the 2 partners form a single household, which means that at least one of them leaves his/her former household
    • in case of a birth, a new person is created; when dying, a person is removed from the population
• Sometimes we may need to create a “clone” of an existing entity
Syntax and examples (NB: links, etc. need specific treatment):
    new('entity_name'[, filter=expr][, number=value] *set initial values of selected variables*)
    clone(filter=new_born and is_twin, gender=choice([True, False], [0.51, 0.49]))
    remove(dead)

33 8. CONCLUSIONS
• Including implementing a model in LIAM2

34 IMPLEMENTING A MODEL IN LIAM2: THE FULL PATH (A-D)
A] Structuring the model, depending on OBJECTIVES, starting e.g. from MIDAS_BE: demography, activity status, tax-benefit => variables, parameters (« GLOBALS ») & alignments needed
B] Building micro data (cf. variables) & macro-based data (cf. alignments & parameters) (e.g. from EUROMOD, survey and/or administrative data)
C] Building and estimating behavioural equations (e.g. probability of divorce)
D] Running, debugging, outputting, validating

