
1 Data: Application requirements, data flow, and person registry
Tom Barton, University of Chicago

2 CAMP Directory Workshop Feb 3-6, 2004 Copyright Tom Barton 2004. This work is the intellectual property of the author. Permission is granted for this material to be shared for non-commercial, educational purposes, provided that this copyright statement appears on the reproduced materials and notice is given that the copying is by permission of the author. To disseminate otherwise or to republish requires written permission from the author.

3 Outline
• Three stages of managing identity information
  1. Feeding the person registry: integrating identity from many authoritative sources
  2. Processes & business logic at the person registry
  3. Feeding consumers of identity information
• Some examples sprinkled in
• Selected policy & process issues (time permitting)

4 Core middleware for an integrated architecture

5 Potential sources of identity info
• “Big” administrative systems: student systems, payroll/HR systems, academic records systems, financials, telecom mgmt system, alumni systems, library systems, …
• “Small” sources: affiliated organizations with fairly simple administrative operations (Excel?)
• Collateral operational systems: application-specific directories/databases, NOS directories, campus card systems, other metadirectory/ID Mgmt operations
• People’s heads: “ad hoc” affiliations, self, proxies

6 UofC sources: now
• Student info & campus card system by live RDBMS views
• Payroll & faculty by periodic batches
• Dozen or so “small feeds” by aperiodic upload
• Self
• Trusted Agents to make temporary and “pre-feed” accounts
• 370 or so departmental directory reviewers
• Network security group

7 UofC sources: planning or earnest discussion
• Feed from UC Hospitals
• Alumni system
• Select distributed IT support staff (mail & password resets)
• Potentially anyone to manage ad hoc groups

8 Feed mechanics
• Source system selection criteria
  – Express the set of affiliation types or constituencies authoritatively represented in the source
  – Affiliation indicator attributes
• Format & transmission technology
  – Complete selections vs. differentials vs. transactions
  – Automated vs. semi-manual (e.g., maildrop) vs. manual
  – scp flatfiles, live views, varieties of EAI (what are you using?)
  – Actual metadirectory products (what are you using?)
  – Ad hoc record structure, XML (what are you doing?)
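The difference between a complete selection and a differential can be sketched in a few lines. This is a minimal illustration, not anything from the talk: it assumes each feed record is a flat dictionary keyed by a source-system identifier, and the names `Differential` and `diff_feeds` are hypothetical.

```python
from typing import Dict, NamedTuple

Record = Dict[str, str]          # assumed shape: flat attribute/value pairs
Feed = Dict[str, Record]         # assumed shape: keyed by source-system ID


class Differential(NamedTuple):
    added: Feed      # records present only in the new selection
    changed: Feed    # records present in both, with different contents
    removed: Feed    # records present only in the old selection


def diff_feeds(previous: Feed, current: Feed) -> Differential:
    """Derive a differential from two complete selections of one source."""
    added = {k: v for k, v in current.items() if k not in previous}
    removed = {k: v for k, v in previous.items() if k not in current}
    changed = {
        k: v for k, v in current.items()
        if k in previous and previous[k] != v
    }
    return Differential(added, changed, removed)


yesterday = {"E1": {"name": "A. Smith", "dept": "Math"},
             "E2": {"name": "B. Jones", "dept": "Physics"}}
today = {"E1": {"name": "A. Smith", "dept": "Statistics"},
         "E3": {"name": "C. Lee", "dept": "Math"}}
delta = diff_feeds(yesterday, today)
```

A source that can only send complete selections can still drive a differential-style registry update this way, at the cost of keeping the previous selection around.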

9 Identity Matching
• Matching strategies
  – Match personal IDs for each source record
  – Per-source shared identifier with prior matching
  – Broadly used institutional identifier with prior matching
• The query “is this person new” is resolved somewhere, somehow.
  – Inaccurate answers spoil 1–1 relationship between registry objects and real world subjects
  – It’s worthwhile to think on how to improve it!
• Insert “rational” ID Mgmt spiel here …
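One way the “is this person new” query might be resolved is sketched below. It is an illustration under stated assumptions: the `Person` fields, the choice of name plus birth date as the personal identifiers, and the three outcomes (`matched`, `new`, `ambiguous`) are all hypothetical and not taken from any campus's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Person:
    registry_id: int
    name: str
    birth_date: str                                   # ISO date string (assumed)
    source_ids: Dict[str, str] = field(default_factory=dict)


def _norm(name: str) -> str:
    """Case-fold and collapse whitespace so trivial differences don't block a match."""
    return " ".join(name.lower().split())


def match(registry: List[Person], source: str,
          record: dict) -> Tuple[str, Optional[Person]]:
    # Strategy 1: per-source shared identifier established by a prior match.
    sid = record.get("source_id")
    if sid is not None:
        for person in registry:
            if person.source_ids.get(source) == sid:
                return "matched", person
    # Strategy 2: fall back to personal identifiers carried in the record.
    key = (_norm(record["name"]), record["birth_date"])
    candidates = [p for p in registry if (_norm(p.name), p.birth_date) == key]
    if len(candidates) == 1:
        return "matched", candidates[0]
    if not candidates:
        return "new", None
    # More than one candidate: refuse to guess, route to a human reviewer.
    return "ambiguous", None


registry = [Person(1, "Ada Lovelace", "1815-12-10", {"hr": "E1"})]
```

Returning `ambiguous` rather than picking a candidate is the design choice that protects the 1–1 relationship: a wrong merge is usually harder to undo than a delayed one.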

10 Identity matching at UofC: now
• SSN
• StudentID (after prior match by SSN)
• “CorpID” (mangling of substrings of lastname, SSN)
• Several options for identifying “self” as authoritative source

11 Identity matching at UofC: upcoming (dose of rationality)
• UCID (SSN replacement) assigned as unique key in payroll & student systems at record creation time
• Person registry is authoritative source of UCID
• “Is this person new” is answered when a new record is to be created in payroll or student systems
• Tightly-coupled and loosely-coupled designs are being considered
• UC Hospitals feed might also use a similar design

12 Canonicalization
• Provide simpler, consistent representation of certain data
  – Name
  – Phone number(s)
  – Address(es)
  – Department names
  – Names of “major” affiliations
• Transformation rules and business logic
  – Which source trumps name
  – Phone & address mappings
  – Rules to determine expressed affiliations
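Two of the transformation rules above, “which source trumps name” and phone mappings, might look like the following. This is only a sketch: the source precedence order, the default area code, and the output phone format are assumptions chosen for illustration.

```python
import re
from typing import Dict, Optional

# Hypothetical precedence: earlier sources trump later ones for the name.
SOURCE_PRECEDENCE = ["hr", "student", "self"]


def canonical_name(names_by_source: Dict[str, str]) -> Optional[str]:
    """Return the name from the highest-precedence source that supplied one."""
    for source in SOURCE_PRECEDENCE:
        name = names_by_source.get(source)
        if name and name.strip():
            return " ".join(name.split())   # collapse stray whitespace
    return None


def canonical_phone(raw: str, default_area_code: str = "773") -> Optional[str]:
    """Map common North American phone spellings onto one representation."""
    digits = re.sub(r"\D", "", raw)          # keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]                  # drop the country code
    if len(digits) == 7:
        digits = default_area_code + digits  # assumed local default
    if len(digits) != 10:
        return None                          # leave unmappable values for review
    return f"+1 {digits[:3]} {digits[3:6]} {digits[6:]}"
```

For example, `canonical_phone("(773) 702-1234")` and `canonical_phone("702-1234")` both map to the same string, so consumers downstream see one spelling.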

13 Fat or thin?
• Fat = contains selected data from sources
• Thin = contains only links to sources
• Issues with thin:
  – Source system availability
  – Source system security (apps need creds)
  – App complexity (feed mechanics, identity matching, canonicalization rules)
  – Policy complexity (authorize N apps to access M sources)
• Issues with fat:
  – Data freshness
  – Downstream from canonicalization (usually a pro, but can be a con)
• Most campuses are fat!

14 Functional requirements for a registry entry
• Private primary key
  – Never reassigned, never revoked
  – Not used for any other purpose
  – GUIDs are preferable to uniqueness within a database
• Publicly visible key
  – Available for sources or consumers to use to refer to the person (better than, say, a username)
  – Probably numeric string <= 9 digits to ensure that it fits in most predefined fields
  – Reduces exposure in case of disaster with primary key
• Crosswalk source and consumer specific identifiers
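The three kinds of identifier on this slide can be sketched as a small in-memory structure. Everything here is an assumption made for illustration: the class names, the use of a random UUID as the private key, and the random 9-digit publicly visible ID are one possible reading of the requirements, not a described implementation.

```python
import secrets
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class RegistryEntry:
    private_key: str                 # GUID; never reassigned, never exposed
    public_id: str                   # numeric string, at most 9 digits
    crosswalk: Dict[str, str] = field(default_factory=dict)  # system -> local ID


class Registry:
    def __init__(self) -> None:
        self._by_public: Dict[str, RegistryEntry] = {}
        self._by_source: Dict[Tuple[str, str], RegistryEntry] = {}

    def create(self) -> RegistryEntry:
        private_key = str(uuid.uuid4())
        # Draw random 9-digit strings until an unused one is found.
        while True:
            public_id = f"{secrets.randbelow(10**9):09d}"
            if public_id not in self._by_public:
                break
        entry = RegistryEntry(private_key, public_id)
        self._by_public[public_id] = entry
        return entry

    def link(self, entry: RegistryEntry, system: str, identifier: str) -> None:
        """Record that `system` knows this person as `identifier`."""
        entry.crosswalk[system] = identifier
        self._by_source[(system, identifier)] = entry

    def find_by_source(self, system: str, identifier: str) -> Optional[RegistryEntry]:
        return self._by_source.get((system, identifier))


reg = Registry()
entry = reg.create()
reg.link(entry, "payroll", "E1042")
```

Because sources and consumers only ever see `public_id` or their own crosswalked identifier, the private key can stay internal to the registry.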

15 Functional requirements for a registry entry
• Personal information
  – Answer the “is this person new” query with sufficient accuracy
  – Support account claiming, initialization, or re-initialization
• Storage for whatever’s authoritative in the person registry
  – E.g.: support for provisioning, email, username(s)
• Information obtained from source systems that is valuable to authorization or entitlement algorithms and policies
• The entry and its principal identifiers and personal info (at least) are never deleted from the registry (except…)

16 Registry record structure at UofC
• RDBMS (Sybase) with tables for:
  – Each major source system
  – One in which to collect all “small feeds”
  – Individuals, one row per person
  – Tracking usernames
  – Supporting service baskets and (de-)provisioning
  – Supporting the security model for registry operations
• DB-local primary key (not a GUID), no PVID
• Records for “temporary” affiliations are removed

17 Logging & reporting requirements
• Audit
  – Who had which identifiers when
  – State changes (when using a stateful provisioning model)
  – Activity, to a degree
• Diagnostic views/reports for selected helpdesk and operational staff
• Refer requests for reports outside of the scope of IT operational needs to the data warehouse group!
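“Who had which identifiers when” implies keeping identifier assignments with validity intervals rather than overwriting them. A minimal sketch follows, with hypothetical field and function names, and assuming half-open intervals (the end date is the first day the identifier is no longer held).

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class IdentifierAssignment:
    entry_id: str              # registry entry that held the identifier
    kind: str                  # e.g. "username" (illustrative)
    value: str
    start: date
    end: Optional[date] = None  # None means still held


def holder_on(history: List[IdentifierAssignment], kind: str,
              value: str, when: date) -> Optional[str]:
    """Answer the audit question: who held this identifier on this date?"""
    for a in history:
        if (a.kind == kind and a.value == value
                and a.start <= when
                and (a.end is None or when < a.end)):
            return a.entry_id
    return None


history = [
    IdentifierAssignment("p-1", "username", "jdoe", date(1998, 9, 1), date(2002, 7, 1)),
    IdentifierAssignment("p-2", "username", "jdoe", date(2003, 1, 15)),
]
```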

18 Provisioning strategy
• Provisioning = maintenance of electronic ephemera required to facilitate users’ access to services
• Format & transmission technology
  – Incremental vs. differential vs. full consumer rebuilds
  – Periodic vs. asynchronous updates
  – Per-consumer or standard record formats
  – Transmission techniques (what do you do?)

19 Provisioning strategy
• Service baskets
  – Business logic that determines which categories of people are entitled to participate in which services, with which service levels
  – One aspect of a more inclusive access control architecture
  – E.g.: shell accounts & quotas, mailboxes, email forwarding, dialup profiles, VPN, wireless, computer registration, calendar, …
  – Issue of excessive granularization
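A service basket can be pictured as a table from affiliation to services and service levels. The sketch below is illustrative only: the basket contents, the quota figures, and the rule that a person with several affiliations gets the most generous level are all assumptions.

```python
from typing import Dict, Iterable

ServiceLevels = Dict[str, dict]

# Hypothetical baskets: affiliation -> {service: service level}.
BASKETS: Dict[str, ServiceLevels] = {
    "faculty": {"mailbox": {"quota_mb": 1000}, "vpn": {}},
    "student": {"mailbox": {"quota_mb": 250}, "shell": {"quota_mb": 100}},
    "alum": {"email_forwarding": {}},
}


def entitlements(affiliations: Iterable[str]) -> ServiceLevels:
    """Union the baskets of all affiliations, keeping the larger quota on overlap."""
    result: ServiceLevels = {}
    for affiliation in affiliations:
        for service, level in BASKETS.get(affiliation, {}).items():
            current = result.get(service)
            if current is None or level.get("quota_mb", 0) > current.get("quota_mb", 0):
                result[service] = dict(level)
    return result
```

Keeping the table coarse (a few baskets keyed by major affiliation) is one way to avoid the excessive granularization the slide warns about.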

20 Stateful provisioning
Not shown: transitions to prospective state from grace, limbo, slide, IDonly.

21 Independent variables for state transitions
• state
• substate
• date the present state was reached
• date by which the present state might end (expiration date)
• major affiliation (faculty, staff, enrolled student, accepted student, registered student, alum, …)
• list of the identifiers of resources being managed for this account
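The six variables above map naturally onto a record plus a transition function. The sketch below is hypothetical: the talk does not define its transition rules, so the `active`, `grace`, and `closed` states, the 90-day grace period, and the function name are assumptions used only to show the shape.

```python
from dataclasses import dataclass, replace
from datetime import date, timedelta
from typing import Optional, Tuple


@dataclass(frozen=True)
class AccountState:
    state: str                        # e.g. "active", "grace", "closed" (assumed)
    substate: str
    since: date                       # date the present state was reached
    expires: Optional[date]           # date by which the present state might end
    major_affiliation: Optional[str]
    resource_ids: Tuple[str, ...] = ()  # identifiers of managed resources


GRACE_PERIOD = timedelta(days=90)     # assumed policy value


def next_state(acct: AccountState, today: date,
               current_affiliation: Optional[str]) -> AccountState:
    """Compute the account's state from its variables and today's source data."""
    if acct.state == "active" and current_affiliation is None:
        # Affiliation vanished from the sources: keep services, start the clock.
        return replace(acct, state="grace", substate="", since=today,
                       expires=today + GRACE_PERIOD, major_affiliation=None)
    if acct.state == "grace" and current_affiliation is not None:
        # Affiliation came back (or the source data was fixed) in time.
        return replace(acct, state="active", since=today, expires=None,
                       major_affiliation=current_affiliation)
    if acct.state == "grace" and acct.expires is not None and today >= acct.expires:
        return replace(acct, state="closed", since=today, expires=None)
    return acct


acct = AccountState("active", "", date(2003, 9, 1), None, "staff", ("mailbox:jdoe",))
```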

22 Fault avoidance & recovery
• Bad source data arrives: what happens?
• Flux high water marks
  – Hold update when # changes exceeds threshold
  – Possible on source side, more often seen in consumer provisioning techniques
• “Semantical filters”
  – E.g., can absence from the HR feed mean anything other than they’re gone?
  – Construct source filters based on knowledge of business practices that relate to selection criteria on the source system.
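A flux high water mark amounts to a guard in front of the update step. The sketch below assumes a threshold expressed as a fraction of the population with an absolute floor; both parameter values and the function name are illustrative, not from the talk.

```python
def should_hold(n_changes: int, population: int,
                max_fraction: float = 0.05, min_changes: int = 50) -> bool:
    """Return True when a batch changes more entries than the high water mark allows.

    The limit is the larger of an absolute floor and a fraction of the
    population, so small populations are not held for routine churn.
    """
    limit = max(min_changes, int(population * max_fraction))
    return n_changes > limit


# A feed that suddenly drops most of a 10,000-person population is held
# for human review instead of being applied to consumers.
hold = should_hold(n_changes=8000, population=10_000)
```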

23 Fault avoidance & recovery
• Person registry change log
  – Enables rollback & replay of consumer updates
  – Good diagnostic info
  – Supports a “hit me with the new ones” incremental provisioning strategy
• Stateful provisioning model can be constructed to ensure continuity of service & buy time to fix effects of bad source data
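The three uses of a change log listed above can be sketched with an append-only list of attribute changes. This is an illustration under assumptions: the `Change` fields, the sequence-number scheme, and modelling a consumer as a nested dictionary are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Change:
    seq: int          # monotonically increasing position in the log
    entry_id: str
    attribute: str
    old: Any
    new: Any


class ChangeLog:
    def __init__(self) -> None:
        self._changes: List[Change] = []

    def record(self, entry_id: str, attribute: str, old: Any, new: Any) -> Change:
        change = Change(len(self._changes) + 1, entry_id, attribute, old, new)
        self._changes.append(change)
        return change

    def since(self, seq: int) -> List[Change]:
        """'Hit me with the new ones': everything after the consumer's last seq."""
        return [c for c in self._changes if c.seq > seq]


Consumer = Dict[str, Dict[str, Any]]


def replay(consumer: Consumer, changes: List[Change]) -> None:
    for c in changes:
        consumer.setdefault(c.entry_id, {})[c.attribute] = c.new


def rollback(consumer: Consumer, changes: List[Change]) -> None:
    # Undo in reverse order so earlier values are restored last.
    for c in reversed(changes):
        consumer.setdefault(c.entry_id, {})[c.attribute] = c.old


log = ChangeLog()
log.record("p-1", "mail", None, "jdoe@example.edu")
log.record("p-1", "dept", "Math", "Statistics")
log.record("p-1", "dept", "Statistics", "Physics")
ldap: Consumer = {"p-1": {"dept": "Math"}}
replay(ldap, log.since(0))
```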

24 Expression of rules
• Hard coded or abstracted rule syntax?
• Rules for
  – Affiliation
  – State transitions
  – Inclusion in service baskets
  – Memberships in selected groups (“minor” affiliations, privilege classes)
• Stanford, Memphis examples
  – Rules expressed in terms of registry object methods
  – External configuration file eval’d by the code that manages the registry
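To make “abstracted rule syntax” concrete, here is a sketch of group-membership rules kept as data outside the code. It is not the Stanford or Memphis approach: instead of an eval’d configuration file it swaps in a declarative JSON document, and the rule schema, group names, and function names are all hypothetical.

```python
import json
from typing import Iterable, List

# Stand-in for the contents of an external configuration file (assumed schema).
RULES_JSON = """
[
  {"group": "library-borrowers", "any_affiliation": ["faculty", "staff", "student"]},
  {"group": "alumni-forwarding", "any_affiliation": ["alum"]}
]
"""


def load_rules(text: str) -> List[dict]:
    return json.loads(text)


def groups_for(affiliations: Iterable[str], rules: List[dict]) -> List[str]:
    """Return the groups whose rule is satisfied by at least one affiliation."""
    held = set(affiliations)
    return sorted(r["group"] for r in rules if held & set(r["any_affiliation"]))


rules = load_rules(RULES_JSON)
```

With rules held as data, changing who belongs to a group is a configuration edit rather than a code release; the trade-off is that the rule language can only express what the interpreter anticipates.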

25 Common consumers
• Minimum set of consumers & consumer technologies needed to meet application requirements!
  – Authentication, attributes, groups, coordinated identity management
• Types
  – Generic LDAP (maybe >1 replication networks)
  – Active Directory (maybe >1 consuming domain)
  – Kerberos
  – eDirectory, NIS, Ph, RDBMS (show hands?, others?)
  – Applications as direct consumers
  – Affiliated identity management operations

26 Consumer identifier issues
• Fundamental IDs
  – Choice of RDN (LDAP consumers)
  – Store/use pvid as a key field?
  – Persistence, visibility, opacity, …
    • Potential interaction with privacy policy
• Representation of attributes
  – Determined by application use cases
  – Consumer specific selection & transformation?
  – Common overloading issues:
    • cn: name of person, name of group, name of service account, name of …
    • uid: more than one user namespace?

27 UofC consumers
• Consumers
  – OpenLDAP (1 replication network), Kerberos, Active Directory, NIS, Ph
    • uid is RDN
    • uid namespace issues: regular, temporary, hospital people
  – Above with periodic diffs, high water hold, async self & management updates
  – Peer ID Mgmt operations (periodic full)
• Service baskets & statefulness being developed
  – Manual quarterly account closures suit UofC culture
  – Automated stateful approach to loss of services per-basket

28 Selected policy & process issues
• How will the University operate its identity management infrastructure?
  – What balance between centralized and distributed operation?
    • Registry: singular, centralized function
    • Consumers: high degree of distribution possible
    • Registration Authorities: small number??
  – Who may have which role with what authority & obligations?
  – Leverages & extends existing data administration policies & processes, or begs if those are insufficient
  – Highly cross-functional activity demanding organizational flexibility

29 Selected policy & process issues
• What entitlements should attend each type of affiliation?
  – “Major” affiliations: student, faculty, alum, …
    • Possibly former or recent student, faculty, …?
  – “Minor” affiliations: in course 123, in department X, in degree program Y, occupant of building Z, …
  – What processes should determine entitlements for each affiliation?
• How should affiliations be structured?

30 Selected policy & process issues
• Who should be issued a credential? What assurance level should authentication for each constituency achieve? What constraints may pertain to each?
  – Applicants (student, faculty, staff)
  – Admitted students, accepted faculty or staff
  – Alums
  – Parents
  – Library patrons
  – Guests: visiting academics, conference attendees, hotel guests, arbitrary “friends”, …

