The previous major update to this case study occurred in the first half of 2009. As recorded in the document entitled A Brief History of Metadata (in the ABS) (referenced simply as BHM hereafter), which is attached to this Case Study, the second half of 2009 saw fundamental decisions made by the ABS leading to initiation of the ABS Information Management Transformation Program (IMTP) in February 2010. While IMTP designates a specific program within the ABS, including a specific top level unit within the organisation chart, the aim is for the ABS to achieve IMT (Information Management Transformation). All staff within the ABS have a role in achieving IMT.
IMT will include fundamental reshaping of policies and strategies related to metadata management developed by the ABS over the past two decades. At this stage, however, IMT remains an early "work in progress".
IMT can be seen as focused on the "to be" environment for the ABS, in terms of business architecture, data/information architecture and other elements of enterprise architecture, as well as on the process for achieving the transformation (including business process re-engineering) required to realise the "to be" state. At this time many details contained in the previous version of this case study continue to accurately describe the "as is" environment for metadata management within the ABS.
In the initial update for 2011 it has been decided to focus the main body of the case study on the "to be" state and initial steps toward that state. Many details of the "as is" environment have been moved into supporting documents. Other aspects can be found by referring to the 2009 version of the case study. It is possible within the wiki to view earlier versions of each page. For convenience, also, PDF versions of the 2009 edition of this case study and 2009 edition of BHM have been made available.
The result of this approach is that the Case Study document is now shorter than the 2009 edition. Additional details will be added to the documentation as IMT progresses, covering its new approach to statistical information management, including metadata management.
It should also be noted that, as with all content in the METIS wiki maintained by ABS practitioners, this is an informal working document shared with colleagues in the field of statistical information management. Unless unambiguously indicated otherwise, no content accessed via these wiki pages should be considered to represent a formal statement on behalf of the Australian Bureau of Statistics.
Ultimately any strategy exists to support the ABS mission and objectives as set out in the organisation's corporate plan. In particular, the availability of appropriate metadata and the application of sound statistical information management practices are critical to supporting informed use of statistics and the quality of the statistical services we deliver to the nation.
BHM provides information on the evolution of ABS strategies related to metadata over time, including extensive information in regard to the Strategy for End-to-End Management of ABS Metadata established in 2003.
IMT can be seen as superseding the 2003 strategy, although at this stage there is no "direct replacement" strategy focused specifically on metadata and its management. IMT focuses instead on strategies related to "statistical information" management which spans metadata (in its broadest sense) and data.
Although IMT supersedes the 2003 strategy, most of the fundamental ideas contained in the 2003 strategy remain relevant. For example, none of the twelve cornerstone principles outlined in the 2003 strategy have been disavowed as irrelevant or inappropriate. In this example, however, IMTP seeks principles
This process of rationalisation is underway currently and it is expected the updated set of principles will be added to the case study once available.
More generally, strategic planning for IMT can be seen as learning from experience with the 2003 strategy (eg much slower progress, and much more mixed success, in putting the strategy into effect than had been anticipated).
Well defined, corporately accepted and supported, governance for information management is much more of a foundation consideration for IMT. This includes clearly established norms/principles/expectations, clearly established authority and accountability and clearly established processes for assessing compliance and actively managing non-compliance. The corporate positioning of IMTP (eg reporting directly to the head of the organisation and independent of any one operational or support division) promotes its ability to address governance requirements successfully compared with the implementation of the 2003 strategy.
The IMT strategy of starting with the Metadata Registry Repository (MRR) as the key enabling infrastructure, including its integration with Statistical Workflow Management capabilities, can be seen as establishing a "central nervous system" to support the new environment – including supporting its relationship with "legacy" applications and repositories – where the 2003 strategy primarily targeted developing new repositories and redeveloping existing repositories without such a well developed strategy for achieving "business integration" in practice.
Compared with the 2003 strategy, IMT work on the Statistical Information Management Framework also includes much greater integration with external frameworks such as GSBPM (Generic Statistical Business Process Model) and GSIM (Generic Statistical Information Model) as well as with other frameworks applied within the ABS (eg Enterprise Architecture).
In October 2009, the ABS Executive formally agreed on Statistical Data and Metadata Exchange (SDMX) and Data Documentation Initiative (DDI) as the standards that will form the core of the ABS's future directions and developments with regard to statistical information management. This means strategic engagement with the two standards communities, including encouraging them to co-ordinate their work in order to support NSIs and others who seek to use both standards, is a high priority.
Participation in the Statistical Network strategy is the primary, but far from only, example of IMT's strategic focus on collaboration (internationally and/or nationally) when it comes to statistical information management.
More detailed formal statements of IMT strategy in regard to statistical information management are still being reviewed within the ABS. Any encapsulation of strategies which is agreed for general release beyond the ABS will be added to this case study once available.
BHM describes how the current situation has evolved within the ABS. Documentation of IMT outlines the current situation.
The majority of data collection and input processing activities for business and household surveys have moved toward implementation of high level metadata frameworks informed by ISO/IEC 11179. These frameworks were developed over the past decade and postdate the ABS specific metadata framework implemented for the corporate output data warehouse developed during the 1990s.
Key elements of current metadata infrastructure, which predate initiation of IMTP in 2010, include major repositories related to
The more recent developments also incorporate an approach to metadata registration based on ISO/IEC 11179 Part 6. Even if some of the older repositories cannot be completely replaced in the next few years it is anticipated that a common high level metadata registration framework, harnessing the MRR, can be implemented across the ABS for all classes of metadata. This does not imply that all classes of metadata will undergo exactly the same registration workflow, but the workflows for each class of metadata will be consistent with a higher level "metamodel" for registration.
Interoperability of the current ABS metadata models, including the legacy "output" model, with third party software (eg SAS, Blaise, SuperCROSS) continues to be an issue.
The increasing focus of the ABS and other agencies on the National Statistical Service (NSS) (http://www.nss.gov.au/nss/home.NSF/) requires development of metadata models and capabilities which are usable beyond the ABS. The NSS needs to interoperate with agencies whose data content is more "administrative", "geospatial" or "research oriented" than "statistically" oriented. This provides additional challenges and issues in regard to metadata modelling.
While many of those agencies are at least as passionate about metadata as the ABS - but from a different "school" - the NSS also needs to support content producers and users for whom metadata is much less of an interest and priority. This raises questions about minimum metadata content and quality standards.
Understandably, metadata is a particular area of focus for the NSS. This includes a simplified and generalised set of principles for managing metadata.
Challenges associated with the current situation, such as achieving a coherent "end to end" metadata driven environment(s) within the ABS and better supporting the NSS, underpin IMT.
Since GSBPM (Generic Statistical Business Process Model) reached full maturity with the release of V4.0 agreed in April 2009 it has been regarded as the preferred reference model for the statistical business process within the ABS. This status for the GSBPM was confirmed with the area leading the enterprise architecture initiative within the ABS at that time. The IMT Program launched in February 2010 also confirmed this status for the GSBPM.
A number of reference models for the statistical business process existed, and were harnessed, within the ABS prior to the development of the GSBPM.
A particularly broadly applied model was affectionately known as "The Caterpillar" within the ABS. The Caterpillar was developed by the ABS as a reference model to support the Business Statistics Innovation Program (BSIP) launched in 2002. It allowed a disparate range of surveys and other statistical activities whose processes were (especially prior to BSIP) very different in detail to describe what they did, why and how (eg what systems and data stores were used) in terms of a common high level reference point for the statistical life cycle. It later allowed "leading practice" to be identified in different parts of the statistical cycle.
The broad relationship between the Caterpillar and GSBPM, documented previously in this section of the case study, has been moved to a supporting page.
Extensive process documentation, together with categorisations of information and even software interfaces, was developed during the course of initiatives such as BSIP and ISHS (both described in BHM). This activity was undertaken based on the "pre GSBPM" reference models for the statistical business process associated with those initiatives. It has been agreed existing process documentation within the ABS will not be rewritten for the sole purpose of referring to the GSBPM. Formalising, and making readily available, mappings between the GSBPM and the local reference models has been particularly important.
The paper Applying the GSBPM within an NSI: Experiences and examples from Australia, prepared for the METIS Work Session in March 2010, provides more information in regard to ABS utilisation of GSBPM as well as in regard to statistical business process models that preceded GSBPM. Annex 1 of that paper provides a full description of the Caterpillar.
Key points include
There are currently many systems within the ABS that encompass significant metadata definition and management aspects.
The MRR (Metadata Registry/Repository) associated with IMT is by far the most significant metadata system currently under development. The MRR's Registry capabilities will act as a "central nervous system" for systems across the ABS that define, manage and use metadata. At this stage the MRR is at the Proof of Concept phase.
An early activity associated with IMT was the first phase of a "metadata census". As suggested by selection of the term "census", it was originally hoped that this activity would provide a much clearer and more comprehensive "as is" picture of metadata management at a local level within the ABS as well as at a corporate level. GSBPM was an important point of reference for indicating what phases and sub-processes each system was supporting with which metadata.
An early issue encountered was the ability for those responsible for systems to describe in a clear and consistent manner the "types" of metadata managed within their systems. For example, if one system was said to work with "variables", another with "data items" and another with "data elements", were all three systems talking about the same "type" of metadata, or about different "types" of metadata (that may be related to each other in some way)? Work associated with GSIM and the MRR should lead to this issue being more tractable in future.
The "as is" picture is also complicated by the fact that many local systems currently need to "replicate", possibly in a specialised format with local content additions, metadata held in existing corporate repositories. The issue about consistently typing metadata compounds the issue of being able to establish which systems are managing which metadata simply because they currently can't source it from elsewhere – as opposed to managing metadata for which that system should be considered an authoritative source within the ABS. A significant number of processing systems currently have a secondary role as a "metadata system" only because – for a variety of reasons – they can't source the metadata they need systematically from elsewhere.
The second phase of the "metadata census" focused in more depth on metadata associated with core corporate stores and systems. The outputs have already contributed to the design of the Proof of Concept for the MRR and will contribute more broadly to the development of the MRR in future as well as being used as one practical test of the scope and nature of metadata requirements addressed by GSIM.
Early work on the metadata census confirmed that some current metadata systems are
Information about key existing corporate metadata systems, documented previously in this section of the case study, has been moved to a supporting page.
In regard to systems envisaged for the future, the Statistical Workflow Management (SWM) facility designed to work with the MRR is expected to provide a source for information related to, for example,
Earlier conceptual and exploratory work identified seven types of "process metadata" from "configuration" metadata about the IT environment and the user running the process, through to metadata which is a formal "input to", or "output from" the process, through to metadata which describes the process itself and which describes how chains of processes fit together. (None of these seven types of process metadata corresponded to "process metrics" as described below. Given there are already more than enough types of "process metadata", the ABS tends not to favour using the term to also denote "process metrics".)
Achieving a clearer path forward in regard to structuring and managing "process" metadata is seen as an important enabler to having other metadata (eg the structural definition of data elements) actively drive statistical processes.
It is intended that the work related to structural definition and description of processes harness appropriate standards such as BPMN (Business Process Model and Notation) and BPEL (Business Process Execution Language).
It is anticipated that, through SWM working with the MRR, it will become possible to specify and analyse detailed information related to the statistical information used by, and produced from, specific process steps.
A further priority is to better capture and store (for automated and interactive analysis and reporting) "process metrics" related to how statistical processes are performing (eg response rates, imputation rates, edit rates etc). Such data about the outcomes of processes is sometimes referred to as "process metadata", "operational metadata" or (typically in specific circumstances) "paradata" by others. Process metrics can be useful for internal monitoring, management and tuning of processes as well as generating data quality indicators for external dissemination.
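As a rough illustration of the kind of "process metrics" mentioned above, the sketch below derives response, imputation and edit rates from raw counts for a single collection cycle. The class name, the field names and the choice of denominators are assumptions made for this example only; they do not describe any ABS system or agreed definition.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CycleCounts:
    """Hypothetical raw counts captured for one collection cycle."""
    units_selected: int
    units_responded: int
    items_expected: int
    items_imputed: int
    items_edited: int


def _rate(numerator: int, denominator: int) -> float:
    # Guard against empty cycles rather than raising ZeroDivisionError.
    return numerator / denominator if denominator else 0.0


def process_metrics(counts: CycleCounts) -> Dict[str, float]:
    """Derive simple process metrics (as proportions) from raw counts."""
    return {
        "response_rate": _rate(counts.units_responded, counts.units_selected),
        "imputation_rate": _rate(counts.items_imputed, counts.items_expected),
        "edit_rate": _rate(counts.items_edited, counts.items_expected),
    }


if __name__ == "__main__":
    print(process_metrics(CycleCounts(200, 150, 1000, 50, 100)))
```

Storing the raw counts alongside the derived rates would allow the same figures to be reused for both internal monitoring and externally disseminated quality indicators.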
Section 2.2 details infrastructure delivered as the result of diverse projects, some of which first delivered outputs more than a decade ago. Lifecycle costs and benefits are extremely difficult to even estimate meaningfully.
Costs and benefits for new developments and redevelopments were estimated when developing business cases. While much better than a vacuum for planning purposes, past experience suggests these cost benefit analyses were seldom borne out with any precision in practice. Often this was because decisions were made over time to diverge from the original project plan in some way rather than just because the original estimation process was flawed or based on imperfect information.
IMTP is instituting a much more rigorous approach to
For future developments, therefore, it should be possible to report more concrete information in this section.
During formulation of the detailed business case for IMT, however, it is not appropriate for the ABS to release to the public domain the details of estimated costs and benefits associated with the program.
Information related to the implementation strategy can be gleaned from the description of IMT (including resources linked to the page) and from Sec
The challenges that provide the drivers for IMT must be addressed in one form or another. In order to achieve the transformation in a timely manner (eg in well under a decade), and realise maximum benefits for users of ABS (and other NSS) statistics, significant resources in addition to those allocated to undertaking and supporting current "business as usual" activities within the ABS will be required. This approach
The first generation of the information management framework and other enabling infrastructure such as the MRR, together with generic tool sets, is required before the main transformation (including re-engineering) across statistical production streams can begin in earnest. As has been the case for all elements of IMT, the main transformation period will be planned in detail prior to commencement (eg which re-engineering for which statistical business process will occur at which time during, eg, a four year period).
In terms of metadata management, the swinging of a pendulum can be seen to some extent in the BHM.
Developments in the 1990s tended to be on a "big bang" basis. These were sometimes pejoratively referred to as "Cathedral Projects" for being too grandiose in ambition and design, and for taking much longer and much more money to complete than originally expected. Nevertheless, many of the results of these projects have proved to be of enduring value - so much so that many outputs have lived on long beyond their prime.
The strategy next (eg as formulated in 2003) became "opportunistic" and "incremental". There was notionally a "master plan" of what should exist in the longer term, but individual "construction projects" were much more modest in scale. Progress toward the "master plan" was much slower, less direct and more difficult than anticipated and hoped.
IMT is establishing a much clearer, more compelling, more widely shared and more actionable "master plan" together with the active corporate mandate and governance to achieve progress. Where the cathedrals of the 1990s tended to be largely designed and built in isolation, the IMT approach focuses on collaborative and sharable solutions underpinned by common standards and frameworks.
A consistent learning has been that a well developed and managed implementation strategy (in addition to a development strategy) is essential. New capabilities are being delivered into a complex context of existing processes and infrastructure. Uptake of those new capabilities needs to be managed and promoted appropriately. (The simple "Field of Dreams" approach of "Build it and they will come!" has never yet worked for us.) Often the new capability, and/or the implementation and communication strategy for it, needs to be refined based on early uptake experience. Whether it is managed by the development team or some other team, every major project requires a well planned and actively managed "Outcome Realisation" phase after it has finished delivering its major outputs.
The ABS doesn't have a formal "taxonomy" of metadata. One was proposed early in development of the 2003 metadata strategy but it wasn't included in the final document. It was found that discussions about how to "class" particular instances of metadata (in borderline cases rather than all cases) could become very protracted without that discussion seeming to generate any real value.
In general ABS concurs with the findings of Bo Sundgren in regard to Classification of Statistical Metadata, namely that multiple valid approaches exist, with the optimum depending on why classification is being attempted.
One form of categorisation sometimes used within the ABS relates to purpose/use of metadata. This means a particular "piece" of metadata may (and often should) support more than one type of use. The categories are
The ABS also recognises "objects" in regard to which metadata can be assembled and registered. These include
These "objects" can be further broken down (eg data elements into properties, object classes, value domains etc).
The main way forward from the ABS perspective at this time is work toward GSIM (Generic Statistical Information Model). This should (among other things) provide a reference classification (or taxonomy) of "information objects" (including "metadata objects") that is shared in common beyond just the ABS.
Work on the Metadata Census within the ABS is also providing a "bottom up" approach to classifying/grouping "information objects" based on the requirements of existing systems and processes (including seeking to harmonise the sets of requirements and align them with constructs described within SDMX and DDI). As described in Section 2.2, this work (together with GSIM) will input to classing the objects supported by the MRR and also provide a use based checklist for testing the GSIM "taxonomy".
The ABS actively aspires to manage metadata consistently throughout the statistical business process. The MRR to be delivered under IMT will have a crucial role in realising this objective in the medium and longer term.
As documented in the supporting page for Section 2.2, consistent metadata related to the identification of collections and cycles and the definition of classifications is already used widely (although not universally) throughout the statistical business process.
As the ABS does not yet have a definitive taxonomy of "statistical information objects" (including "metadata objects") we do not yet have a definitive mapping of metadata used/created at each phase of the statistical business process. Nevertheless, indicative representations were created in 2006 and 2010.
The 2006 representation predates the ABS adoption of the GSBPM. The most legible version of the diagram predates the final version that was ultimately included in a briefing paper for senior management. Unfortunately the source file for the final version of the diagram has not yet been located and its reproduction from the briefing paper itself is hard to read.
The 2010 working document has been loaded to the METIS wiki.
IMT has a focus on statistical information management rather than, eg, financial and human resource information within the ABS.
That said, information about costs, information about organisational structures, people and their roles are examples of information that can be relevant to managing and performing statistical business processes. More generally Statistical Information, Corporate Information and Business Information can be visualised as intersecting circles, and the intersections with Statistical Information are certainly in scope for IMT.
Also, as long as it does not distract from the focus on the statistical information used and/or produced in the course of various sub processes within a statistical business process, there is no reason why elements of the Statistical Information Management Framework associated with IMT could not be applied to these other information domains.
A further connection is that a range of high level key performance indicators in relation to the core business of the ABS are expected to be able to be sourced via the MRR in future. These will assist in high level corporate monitoring and reporting, on a more consistent and informed basis, of efficiency, productivity, return on investments etc.
One reason IMT does not have a primary focus on corporate information and business information is that well recognised standards and frameworks (not specific to producers of official statistics) already exist for these domains of information. A more natural alignment in this case might be with Australian Government Architecture rather than the common reference architecture for producers of official statistics internationally.
These other domains of information are recognised within ABS Enterprise Architecture (eg in terms of data/information architecture). The redevelopment of information systems related to human resource management is an example of an "architecturally significant" project in this regard.
In this regard, the IMT approach parallels the 2003 metadata strategy which defined its scope as relating to "statistical" metadata (rather than all the metadata potentially relevant to any aspect of ABS operations).
ABS Enterprise Architecture harnesses The Open Group Architecture Framework (TOGAF) which recognises domains of business, data, applications and technology architecture. In describing "IT Architecture" below, reference is primarily made to applications and technology architecture. Connections with data architecture are also explored.
Unless otherwise noted, descriptions in this section refer back to the main metadata systems as described in Section 2.2.
The newer metadata facilities are based on a Service Oriented Architecture. The older facilities tend to have monolithic coupling of the repository, the business logic and business rules (which are built into the application rather than embedded in services) and the User Interface.
Nevertheless, selected information about the collections defined in CMS is "projected" from CMS into an Oracle database. While only a small subset of the total information held in CMS, this comprises all of the core "structural" registration details about collections, cycles and profiles. Basic (read only) "collection metadata services" based on this content on Oracle are then provided for statistical processing applications to access.
A similar approach applies in the case of classifications except a much greater percentage of the total information held in regard to classifications is both "structural" and available on Oracle.
Apart from CMS and ClaMS (which include some descriptive content held only in IBM's Lotus Notes product) the other metadata holdings are all based in Oracle. There is extensive use of Oracle Stored Procedures for reusable services/functions and some use of true web services.
In summary, more recently developed facilities, based on recent architectural standards within the ABS, tend to consist of
While SOA offers a lot of opportunities and potential, it also comes with a lot of new complexities compared with earlier approaches. It requires new understandings and a new mindset from those developers who are being asked to take up, and interact with, the available services as well as requiring the same from the business analysts and programmers within the team responsible for providing the metadata repositories and services. It can make the overall environment more complicated in some ways (eg services are calling services that call services etc and then somewhere at a low level a service is updated and everything needs to be configured appropriately to allow proper testing of that change). Implementing SOA in environments that include a lot of "legacy" processing systems that are not enabled for the new architectural directions is particularly challenging.
During 2008 it became clearer that a significant aspect of the work on establishing an updated and coherent metadata framework for the ABS amounts to defining Enterprise Information Architecture (EIA) in the context of a statistical organisation. Without a clear and coherent EIA, there is a risk each service, or each bundle of services, is delivered with its own explicit or implicit information model. The ABS could have gone from having a dozen or so environments with subtle and not so subtle differences in their underpinning information concepts and structures to having an array of services based on a plethora of different, and unreconciled, information models. On the positive side, SOA can help make EIA practical and consistent. Rather than having the same objects and relationships specified in the EIA implemented, and extended, differently across a number of different environments, a single consistent but flexible bundle of services could be used within each environment. SOA and EIA are complementary rather than alternative directions.
The IMT strategy addresses the requirement for SOA and EIA to work together. It enables common information constructs, defined according to schemas aligned with relevant standards such as SDMX and DDI, to be used consistently via service layers. These service layers enforce core business rules. They also mean application developers can work with information objects at a business level without needing to understand, and code based on, the full details of the SDMX and DDI information models. The integration with Statistical Workflow Management is also an important element of the "to be" IT Architecture.
Statistical processing applications interact with metadata via services where possible. As described in BHM, however, many ABS processing applications and third party vendor products are not yet amenable to this approach. Where this approach is used currently it most often involves the application "reading" relevant content from the metadata repository rather than writing back new or updated records.
The IMT strategy seeks to fully, and consistently, realise this approach. Some existing key applications (and repositories) may need to be "wrapped" so they can interact with the MRR on a CRUDS basis. ("S" refers to harnessing the MRR Search capabilities to support discovery, selection of relevant content to Read etc.) Other legacy applications may need to be decommissioned, through delivery of services and interfaces that take their place, and content from a number of legacy repositories will need to be migrated to the (logically) centralised repositories associated with the MRR.
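A minimal sketch of what "wrapping" a legacy repository on a CRUDS basis could look like is given below. The interface, the class names and the dictionary standing in for the legacy store are all hypothetical; the actual MRR service interfaces are not described in this case study.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional

Record = Dict[str, str]


class CrudsRepository(ABC):
    """Hypothetical CRUDS contract: Create, Read, Update, Delete, Search."""

    @abstractmethod
    def create(self, object_id: str, record: Record) -> None: ...

    @abstractmethod
    def read(self, object_id: str) -> Optional[Record]: ...

    @abstractmethod
    def update(self, object_id: str, changes: Record) -> None: ...

    @abstractmethod
    def delete(self, object_id: str) -> None: ...

    @abstractmethod
    def search(self, text: str) -> List[str]: ...


class WrappedLegacyRepository(CrudsRepository):
    """Adapter exposing a legacy store (here just a dict) through CRUDS."""

    def __init__(self, legacy_store: Dict[str, Record]) -> None:
        self._store = legacy_store

    def create(self, object_id: str, record: Record) -> None:
        if object_id in self._store:
            raise KeyError(f"{object_id} already exists")
        self._store[object_id] = dict(record)

    def read(self, object_id: str) -> Optional[Record]:
        record = self._store.get(object_id)
        # Return a copy so callers cannot bypass update().
        return dict(record) if record is not None else None

    def update(self, object_id: str, changes: Record) -> None:
        self._store[object_id].update(changes)

    def delete(self, object_id: str) -> None:
        self._store.pop(object_id, None)

    def search(self, text: str) -> List[str]:
        # Case-insensitive substring match over all field values.
        needle = text.lower()
        return sorted(
            object_id
            for object_id, record in self._store.items()
            if any(needle in value.lower() for value in record.values())
        )
```

Applications written against the abstract contract would not need to change if a wrapped legacy store were later replaced by a centralised repository offering the same five operations.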
In the meantime, as described in the introduction to 2.2, there are cases where metadata from the Corporate Metadata Repository needs to be restructured and/or repackaged relatively manually to make it suitable for use in particular processing systems.
Standards and formats currently in use for major metadata repositories are described in Section 2.2.
Under IMT, the primary standards are SDMX and DDI, interoperating with other "purpose specific" standards such as
Regardless of which standard's information model is being harnessed, content for interchange (eg to be read by applications) is typically represented in XML. In order to reduce the need to exchange large XML structures, where only a small proportion of the total information may be needed for a particular application, the XML used to describe an object can refer to sub components and related objects "by reference" rather than including all this information "in line". The calling application can then resolve the specific references (if any) which are relevant to its particular needs - once again typically resulting in smaller packages of XML than would be the case if a comprehensive set of information related to the component was included "in line".
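The "by reference" approach can be sketched as follows. This is a minimal illustration, not SDMX or DDI markup: the element names (Dataset, Variable, Ref), the identifiers and the in-memory REGISTRY are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: identifier -> XML describing that object.
REGISTRY = {
    "var:age": '<Variable id="var:age"><Label>Age</Label>'
               '<Ref target="cl:agegroups"/></Variable>',
    "var:sex": '<Variable id="var:sex"><Label>Sex</Label></Variable>',
    "cl:agegroups": '<Classification id="cl:agegroups">'
                    '<Label>Age groups</Label></Classification>',
}

# The dataset description carries references, not the full content.
DATASET_XML = """
<Dataset id="ds:census">
  <Label>Census extract</Label>
  <Ref target="var:age"/>
  <Ref target="var:sex"/>
</Dataset>
"""


def resolve(element, wanted):
    """Replace direct <Ref> children whose target is in `wanted`.

    References not in `wanted` are left in place, so the caller only
    pulls in the XML it actually needs. Nested references inside the
    resolved objects are not followed.
    """
    for index, child in enumerate(list(element)):
        if child.tag == "Ref" and child.get("target") in wanted:
            element[index] = ET.fromstring(REGISTRY[child.get("target")])
    return element


# This application only needs the age variable, so only that is resolved.
dataset = resolve(ET.fromstring(DATASET_XML), wanted={"var:age"})
```

Here the reference to "var:sex", and the nested reference from the age variable to its classification, remain as references until some application actually needs them.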
While XML is used for interchange, current repositories tend to store content using RDBMS (relational database) technology. XML stores and graph databases are technologies being considered for the future to augment RDBMS approaches.
Expression in RDF format (which builds on simple XML representation) is seen as an important additional capability in future. This is seen as one advantage of harnessing standards - in many cases the community for a standard has already developed a recommended expression in RDF.
The approach to versioning has been a major point of debate within the ABS previously. As the systems have grown up at different times, their approach to version control tends to differ.
In general, where there was not seen to be a compelling case for supporting formal versioning, past developments tended to avoid that "complexity". Collections, for example, are not currently versioned. Many aspects of change over time for a collection, however, can be handled through descriptions of the "cycle" or the "profile" rather than edits to the main collection document itself.
Under IMT, however, versioning is seen as a prerequisite for active use and reuse of metadata. The structural definition of a metadata object at the time it was referenced must remain accessible even if a new version of that object is defined subsequently. This is consistent with the approach taken in standards such as SDMX and DDI. Both of these standards have a concept of objects being able to be in "draft" mode, in which case they should not be referenced for production purposes. The standards do not require versioning of drafts but it is likely that the MRR will support versioning of drafts.
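A minimal sketch of this principle, under stated assumptions, is shown below. The names (Version, VersionedRegistry, reference_for_production) are hypothetical and do not describe the MRR; the sketch simply keeps every version, never overwrites, and refuses production references to drafts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One immutable version of a metadata object."""
    object_id: str
    number: int
    definition: str
    draft: bool = False


class VersionedRegistry:
    """Keeps every version of every object; nothing is overwritten."""

    def __init__(self):
        self._versions = {}  # (object_id, number) -> Version

    def add_version(self, object_id, definition, draft=False):
        # Next number is one more than the highest existing for this object.
        number = 1 + max(
            (n for (oid, n) in self._versions if oid == object_id), default=0
        )
        version = Version(object_id, number, definition, draft)
        self._versions[(object_id, number)] = version
        return version

    def get(self, object_id, number):
        return self._versions[(object_id, number)]

    def reference_for_production(self, object_id, number):
        # Drafts may exist and be versioned, but not referenced in production.
        version = self.get(object_id, number)
        if version.draft:
            raise ValueError("draft versions should not be referenced for production")
        return (object_id, number)
```

A reference obtained for version 1 continues to retrieve the original definition after version 2 is added, which is the property the paragraph above describes.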
Past debates over when a change is so fundamental that it should result in definition of a new object, rather than a new version of an existing object, remain to be addressed in the IMT context.
Past debates about changes that are so "trivial" (eg fixing a spelling mistake) that they shouldn't result in a version change also remain to be finalised in the IMT context.
An example of problems arising from lack of appropriate support for versioning in current infrastructure is the classification system. It could benefit, for example, from the Neuchatel approach to modelling classifications, versions and variants as well as the IMT approach to not overwriting previous content.
Within the current system each registered object is essentially an independent entity (ie a "new classification"). It is possible to designate one classification as being "based on" another but this can mean many different things
Where revisions are to be made (or new versions created) as much impact analysis as possible is undertaken. This includes, for example, understanding what other metadata objects and processes refer to the object that is about to be revised (or versioned) and whether the revision will have any inappropriate impact (whether the new version should be referenced instead). The lack of fully "joined up" registries (including knowing exactly what metadata is referred to in each processing system) makes impact assessments difficult and only partially reliable in some cases.
The MRR and Statistical Workflow Management working together in future should greatly assist in this regard. While existing metadata objects and business processes will be able to continue referencing the present version of an object that is proposed to be updated/versioned, understanding these existing uses and the requirements associated with them
The preceding example illustrates the flow on impacts that versioning can have within a complex and actively used metadata registration system. If the existing metadata objects that refer to the object that just got "versioned" now need to refer to the newer version of that object, all those existing metadata objects themselves now need to get "versioned" (because they're pointing to a different version of the first object). All the objects that refer to the objects that referred to the original object now need to get assessed and potentially versioned themselves, and so on, with a ripple effect potentially sweeping across the whole registry originating from just one object being versioned. (While standards such as DDI-L support the option of "late binding", they recommend against it for many purposes. Under "late binding" a reference to another object is always deemed to refer to the most recent version of that object - rather than, eg, to the specific version of the object that was current at the time the reference to it was made. "Late binding" reduces precision and leaves open the possibility that the object referred to will subsequently "evolve" in ways that contradict the initial basis for referring to it.)
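The "ripple effect" can be sketched as a breadth-first walk over the references between objects. The registry contents and the function name below are hypothetical, chosen only to illustrate the idea; each object is assumed to pin the exact version it refers to (ie not "late binding").

```python
from collections import deque

# Hypothetical registry: object -> {referenced object: pinned version}.
REFERENCES = {
    "classification": {},
    "variable": {"classification": 1},
    "question": {"variable": 1},
    "dataset": {"variable": 1},
    "publication": {"dataset": 1},
    "unrelated": {},
}


def ripple(changed, references):
    """List the objects to assess, nearest first, if `changed` is versioned
    and every referrer is then moved to point at the newer version."""
    to_assess, seen, queue = [], {changed}, deque([changed])
    while queue:
        current = queue.popleft()
        for obj in sorted(references):
            if current in references[obj] and obj not in seen:
                seen.add(obj)
                to_assess.append(obj)
                queue.append(obj)
    return to_assess
```

In this example, versioning the classification leads to the variable, then the dataset and question, then the publication being assessed, while the unrelated object is untouched. Each entry in the returned list is a point at which a manual or automated decision could be made about whether to move to the newer version or stay with the pinned one.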
The IMT approach supports user decision points (which may be manual or automated) in regard to the "ripple effect" of versioning. It also provides the greatest systematic support for managing initial and "consequential" versioning processes.
While external expert consultants were engaged from time to time, the existing metadata systems described in Section 2.2 were all designed and developed "in-house". Open source and other starting points for the Data Element Registry were seriously considered.
ABS (and the Australian Government) ICT Policy and Strategy is placing a greater emphasis on COTS (Commercial Off The Shelf) and GOTS (Government Off The Shelf) based solutions. "Bespoke" software developments (whether through in house development or commissioning of external developers) to deliver all, or part, of a solution are seen as a last resort if other options are demonstrated not to be viable.
From an ABS perspective, however, it remains typically the case that in house staff
Ensuring solutions are consistent with Enterprise Architecture, including Service Oriented Architecture and support for relevant open standards, promotes effective integration (with minimum need to re-engineer other systems), reduces risks of "vendor lock" and facilitates end of life decommissioning (and possible replacement).
The approach to IMT aligns with these ICT strategies and policies. This includes
While not all developments related to IMT will necessarily deliver, or harness, open source components, open source is recognised as one important paradigm for sharing solutions and sustaining their evolution over time.
In addition to seeking to collaborate with other agencies, ABS is drawing on input from expert consultants to help developers understand and apply information standards such as SDMX and DDI and to assist in designing key infrastructure such as the MRR.
Development of REEM (Remote Execution Environment for Microdata) is an example of the ABS working with a vendor that shares our interest in harnessing standards such as SDMX and DDI-L. Elements of the REEM solution include
ABS implementation, as ABS.Stat, of the OECD.Stat platform is an example of harnessing an existing standards aligned shared solution and entering into a collaborative partnership (with OECD, IMF, Statistics New Zealand and Istat) to maintain and evolve that solution in future.
At present, many systems (as described in section 2.2) used by the ABS are built in a "monolithic" fashion (combining the repository, the business logic and the user interface) and are highly customised for the ABS environment (eg they rely on both IBM Lotus Notes and Oracle databases which are configured in a particular way). CMS, ClaMS and the Dataset Registry are all in this category. While there is no in principle objection to sharing these components with other agencies, doing so in practice would be very complex both for the ABS and for the other agency. In any case, as these facilities were developed more than a decade ago and predate relevant application architecture and metadata standards, it is not anticipated any other agency would be interested in making use of these facilities in their current form.
Newer facilities such as the Data Element Registry (DER) and Questionnaire Development Tool (QDT) are architected in a manner that would make it easier to share them. Both of these facilities are designed so that a user interface interacts with the Oracle database via a "Business Services Layer" (BSL). In addition to full sharing, partial sharing could be supported (eg the ABS providing the repository and BSL, with the other agency choosing to develop its own user interface).
Sharing could be envisaged in at least two forms. One would be the ABS packaging either the full facility or some layers from the facility in a form which allowed another agency to establish a "stand alone" instance. A second form would be extending the BSL (and probably repositioning the repository) so that authorised and authenticated interactions from outside the ABS became possible in regard to the current instance of the facility. One or more external agencies might then act as registration authorities in their own right. This could have many benefits in terms of sharing, and shared development of, metadata content but would be likely to require more thought in terms of ongoing governance and support arrangements.
A third possibility, which physically "cloned" the repository (ie the first option) but supported a unified logical perspective across the original repository and the clone(s) (ie elements of the second option), would also require significant additional work.
While these facilities are deliberately more compartmentalised and self contained in design, they were not developed from the ground up with the intent of sharing beyond the ABS. Some generalisation of ABS specific aspects (eg linkages of both the DER and QDT to collection information from the CMS) would still be required.
The software the ABS has available should be able to be made available to other statistical agencies free of charge in its current form. If the ABS needed to modify the software and/or provide consultancy support in order for that software to be made operational outside the ABS then that work may need to be cost recovered. Alternatively, and preferably, it may be possible to agree a collaborative arrangement such that the existing facility is extended and generalised in a manner that benefits both the ABS and the other agency.
The ABS seeks to avoid becoming a "software house". Any sharing arrangements would be in the context of either one off provision or, preferably, some form of partnership. A relationship along the lines of the ABS acting as a provider to one or more "customers" does not fit with current ABS aspirations and directions.
A number of other ABS applications (eg ABS Autocoder and REEM) are also listed in the Sharing Advisory Board's inventory of software available for sharing.
Short of sharing software itself, the ABS is very happy to exchange details of data models, application architectures, user experiences etc with other statistical agencies.
New developments such as the MRR are being designed to be more readily sharable, in whole or part.
While the ABS has relatively few components currently that other agencies may be interested in sharing, the ABS is placing a very high priority on establishing collaborative partnerships with other agencies to develop new components, or to extend existing modern standards aligned components that already exist outside the ABS.
Initiation of IMTP in February 2010 led to significant adjustment of roles and responsibilities within the ABS.
The 2003 metadata management strategy had stated that, in terms of governance
Metadata management becomes part of every project and each project ensures that they consider and budget for resources to handle metadata development and maintenance.
It is sometimes suggested that by making something "everyone's business" it becomes nobody's business.
The Data Management Section (DMS) within the ABS was to be "consulted" and had a co-ordinating and advisory role. The aim was that the Corporate Metadata Repository (CMR) and its services would be progressively extended to meet the needs of new application developments. DMS developed guidelines to assist project planners, project managers, business analysts and IT staff in understanding the practical meaning and intentions of the principles and how they might apply in the context of a specific project. DMS also provided direct interactive advice to planners, analysts and IT staff.
In practice, however, the design and development of new metadata repositories and services were driven by the initiatives that required them, and paid for them, such as BSIP and ISHS (see BHM). Given input from DMS, architectural design panels and other sources the designs were notionally left open for use by other projects, and for integration within the CMR, but these outcomes were given relatively low priority in practice.
DMS also continued to fulfil roles it had prior to the 2003 strategy. DMS has been responsible for Data Management Policy within the ABS and maintaining the ABSDB and selected other infrastructure such as CMS and ClaMS described in the supporting page for Section 2.2 of this case study. The maintenance role includes
While DMS ensured necessary "repository infrastructure" was provided, and that the infrastructure remained "fit for purpose" in a changing organisational and technical environment, it is not responsible for the quality of the content held within each repository. That responsibility rests with the subject matter areas and others who provide the content and have an ongoing custodianship responsibility, including ensuring the content remains up to date and answering any enquiries its definition might generate from others.
Data Management Policy mandates use of the corporate facilities for various purposes and subject matter areas are responsible for making use of the facilities in accordance with those policies.
The Standards and Classification Section (SCS) has a number of leadership roles in regard to metadata content within the ABS. SCS develop and support "standard" classifications and variables which are cross domain in nature (eg industry, occupation, language). Many of these are recognised standards for Australia as a whole, not just the ABS. SCS also provide guidelines and advice to help subject matter areas ensure their "collection specific" metadata is well defined and curated.
DMS and SCS form the Data Management and Classifications Branch (DMCB) within the Methodology and Data Management Division (MDMD). DMCB brings together specialists in metadata modelling and systems with specialists in metadata content, in order to reinforce each other's work and to provide strong integrated support to the ABS and the broader National Statistical Service.
With announcement of the IMTP, an early matter to be clarified was the nature of the new program's relationship with MDMD - and DMCB in particular. The conclusion was that IMTP would assume leadership at the strategic level in regard to (Statistical) Information Management. The Program Board for IMT, for example, consists of the head of the ABS and his four deputies. This Board is therefore able to address organisational governance and alignment issues, including in regard to Statistical Information Management, that the approach to implementing the 2003 strategy had been unable to address in practice.
Naturally the IMTP leadership role entails working closely with MDMD. It is recognised, also, that IMTP is leading a transformation process (which, from July 2011, is expected to take at least six years to complete). At the conclusion of that transformation process IMTP is not expected to continue as an organisational unit in its current form. Strategic leadership therefore needs to transition to a sustainable arrangement within the "post IMT" organisational structure.
As described in IMT, this leadership role is reflected in activities such as development of the Statistical Information Management Framework, design work associated with the MRR and leadership of the international OCMIMF collaboration. A team of information management specialists, and business analysts specialising in information management systems, exists within IMTP. At the current time (July 2011) this team comprises half a dozen staff. It is expected to approximately double in size during the coming year.
DMS staff have been seconded to IMTP on a rotating basis to assist with its IM work program.
In addition, DMS has notionally divided its work program between maintenance of existing infrastructure (as described above) and supporting IMTP through
There are around a dozen staff within DMS currently, with their duties split fairly evenly between maintenance of existing infrastructure and supporting IMTP.
A third key area (beside IMTP and DMCB) is the SISD (Statistical Infrastructure and Solutions Design) unit within the technology oriented division of the ABS. One role of SISD is to provide technical leadership and support in regard to Enterprise Architecture, including data/information architecture. SISD also leads and supports the "solutions design" process within the ABS, ensuring that new developments (particularly those that are classed as "Architecturally Significant") are designed with due regard to agreed architectural principles and practices. Alignment with the "to be" business and data/information architectures, whose definition is emerging from IMT, is a key consideration in this regard. The "Metadata Building Code" developed by DMS during 2010 currently provides guidance in this regard.
The solution design process culminates with a formal Design Review that comprises senior executives from Technical Services Division, IMTP and relevant business stakeholders. Where an appropriately consultative solution design process has been followed prior to the formal Design Review, however, key "architectural concerns" from various perspectives should already have been identified by stakeholders and addressed in the design proposal. The Design Review should serve as a formal gate to confirm the solution design process has been conducted appropriately, and confirm high level support for the solution proposed, rather than result in fundamentally new concerns being identified.
In addition to these key organisational units (IMTP, DMCB and SISD) there are a range of governance, reference and advisory groups that include participants from across the ABS. These groups assist in steering and informing ABS priorities and directions related to Statistical Information Management.
Phase 5 of IMT will focus on extending facilities to support the discovery of and access to data within the NSS. In the meantime, however, the Data Leadership Initiative (DLI) is being sponsored by NSSLB (National Statistical Service Leadership Branch). DLI aims to promote within the NSS best practice standards to help ensure data is 'fit for purpose' for statistical use. This includes best practice in application of exchange standards such as SDMX and DDI. NSSLB is working closely with IMTP and DMS in regard to these aspects of DLI.
The following table contains a list of specialists in metadata man= agement in the ABS:
Role/Position in ABS, followed by contact telephone number
Chief Statistical Information Architect - IMTP: +61 2 62525416
Director - DMS: +61 2 62526300
Director - SCS: +61 2 62525920
Chief Architect - SISD: +61 2 62526736
Director, Statistical Coordination, NSSLB: +61 3 96157500
General training in regard to IMT, including the future for statistical =
information management, is only starting to be developed and remains at a g=
Capability building for information management specialists and IT developers in regard to SDMX and DDI has been a priority since the decision of the ABS Executive in October 2009 that these standards will form the core of the ABS's future directions and developments with regard to statistical information management. To date this has primarily been achieved through engaging international experts to present structured courses and workshops. On line learning packages and other training materials developed overseas have also been researched and evaluated, and then utilised where appropriate. It is planned to "train trainers" within the ABS to be able to deliver basic and intermediate (but not necessarily advanced) training in regard to these standards (and their application within the ABS) in future.
Several of the deliverables from the current activities being undertaken by IMTP (eg the Metadata Registry/Repository, the Statistical Information Management Framework) create training needs in order for these outputs to be harnessed appropriately by business and technical users.
In regard to existing infrastructure, DMS provides a range of training.
In addition, a Corporate Metadata Repository (CMR) Assistant is available from the home page of the ABS intranet. This provides a portal to overview and detailed information about the available facilities as well as related policies, guidelines and training courses. It also provides direct access to the facilities themselves by allowing users to click on the component of interest as represented in a high level diagram showing how the various facilities fit together.
As the CMR is "part of the way the ABS does business", the generic training offered by DMS is only one strand. The training about dissemination processes in the ABS, for example, includes information about how content defined in the CMR can be drawn into the various dissemination channels and made available outside the ABS. DMS provides development assistance and input on the components of these training courses that relate to the CMR.
Similarly the corporate "Assistants" related to Business Statistics, to Household Surveys and to Publishing cross reference relevant content from the CMR Assistant where appropriate.
The strategy of presenting information about the CMR in the context of a particular wider business process, rather than trying to present everything about it exhaustively in a major CMR specific training program, appears to work well.
The major partnership specifically related to metadata management, as de=
scribed under IMT, is the ABS work on the OCMIMF Collaboration with five ot=
Given the ABS Executive decision in regard to SDMX and DDI in October 2009, the ABS has a strong interest in how effectively and efficiently these standards work together, both currently and into the future. The ABS is therefore a very active participant in the SDMX/DDI dialogue process which also engages the two standards bodies together with a number of other NSIs and international agencies.
While developments such as the MRR and the Statistical Information Management Framework are not formally structured as collaborative projects, plans and experiences in regard to them are shared at an informal working level. Informal interchange is particularly common with agencies that are undertaking similar developments which harness SDMX and DDI working together.
More generally, the ABS is very keen to share information and experiences and to collaborate within METIS generally as well as on a narrower (eg bilateral or "working group") basis.
At a national level, ABS is undertaking a number of metadata related projects in conjunction with ANDS (Australian National Data Service). The primary focus for ANDS is infrastructure to better support the data management and access requirements of researchers. Public Sector Information, including statistical information from the ABS, is a key information resource of interest, and value, to the research community.
The National Statistical Service (NSS) provides many opportunities for other collaborations. These include working with State and Territory Government agencies that are undertaking major data related initiatives as well as working with sector specific initiatives (eg the Australian Transport Data Action Network) that span agencies at the State and Territory as well as the Australian level. NSS initiatives take the ABS beyond simply collaborating with other statistical agencies and into collaborating with other metadata communities, such as the geospatial community, the research community, and others.
One collaborative project, for example, with a state government agency and the university sector involved developing "injectors" for technical metadata about usage rights under the Creative Commons framework. The software allowed information on usage conditions to be "injected" into spreadsheets and other products so this information remained associated with the content even after it had been downloaded from the web. The Creative Commons organisation itself has now expressed interest in assuming responsibility for ongoing custodianship and development of the software.
Over the past 15 years the term "metadata" has become common parlance within the ABS. The value and importance of metadata is widely recognised.
There is also a degree of disappointment, frustration or scepticism expressed in some quarters because more progress hasn't been made more quickly and we haven't yet made metadata simple to manage and maintain as well as "all powerful" in driving and describing all processes and outputs. The vision expressed in the 2003 metadata strategy to some extent fostered expectations that were unable to be met during the subsequent years of implementation. Questions have been raised in regard to what is different about IMTP which will allow larger scale success this time.
As illustrated elsewhere in this case study, however, the corporate positioning of, and support for, IMTP is incomparably stronger than the positioning for implementation of the 2003 strategy. The profile of IMTP has led to much more active business (including senior executive) engagement from across the ABS in shaping the IMT strategy and its expression. It is a corporate initiative, driven by business strategy and requirements, rather than an initiative driven (in reality or in terms of common perception) by IT and/or IM specialists. In addition there are enablers (eg mature standards and technologies) capable of supporting IMTP that did not exist eight years ago. Partly because of these enablers, great strides have been made within the wider community of producers of official statistics which mean IMT is able to harness collaboration, and shared solutions, in a manner that was not possible in 2003. In addition, IMT learns from past ABS experiences in this field - and the experiences of other agencies.
The fact the term "metadata" is so widely used, in a variety of valid but different contexts, is emerging as an issue. The focus of IMT on "statistical information" rather than specifically "data" or "metadata" is seen as an advantage in this regard. At a minimum most references to "metadata" in discussions within the ABS, or outside the ABS, require clarification of which type(s) of metadata are being referenced.
Being primarily aware of low level technical examples, some managers are unsure why metadata should be considered a strategic business challenge and enabler within the ABS rather than a purely technical matter. Once again, a focus on "statistical information" (which is the core business of the ABS as an NSI) can be helpful.
Similarly there is frequent confusion between "metadata concepts, models, systems etc" and "metadata content". It is challenging to promote a message that investment in well designed and integrated metadata infrastructure is a necessary, but not sufficient, condition for achieving consistently high quality of metadata content. Senior managers have tended to have unrealistically high expectations of what will be delivered - which would lead to disillusionment if not addressed in advance - or else their expectations are so low that they are unwilling to commit resources to the effort.
Very significant challenges arise from the fact staff often enjoy the challenge of, and receive satisfaction from, developing definitions, structures, frameworks etc from first principles. They often also find it hard to resist the temptation to "tweak" the wording of a definition, the details of a structure etc that they already recognise as basically fit for purpose but which they believe could be improved upon slightly for their specific purpose. This can be seen as part of a culture of "local optimisation" rather than "global optimisation". A series of poorly integrated local optimisations, however, may result in an inefficient, sub-optimal end to end business process. In addition, a diversity of "locally optimised" processes/systems across the organisation typically proves very hard to sustain over time.
Seeking of "local optimisation" by employees can be linked to a sense of professionalism and pride in their work. It is vital not to undermine the latter two when seeking to address the former. Aiming for "local optimisation" also tends to be simpler than seeking global optimisation.
It is also the case that simple reuse isn't always the answer. Sometimes local divergences are appropriate even when viewed from a wider perspective. The trick becomes identifying when this is the case. Such cases typically require "designing the divergence" such that re-use of existing concepts and content is maximised, with the divergence being only to the extent required. This becomes a difficult balancing act. There is a temptation to revert to "starting with a blank slate" as soon as it becomes apparent re-use will not be simple.
As illustrated in the preceding two paragraphs, there is scope for the aim "think globally, act locally" to create even more challenging and satisfying roles for staff, but the extent of the cultural change required to reach that point appears daunting.
Exactly the same "local optimisation" issues described above in regard to subject matter staff reusing metadata structures and content have been observed in terms of programmers re-using existing services as part of Service Oriented Architecture.
6.1 Lessons Learned