Estonian Information Society Yearbook 2011/2012. Karin Kastehein


would be to use, instead of the table/field name in the original system, the more general and common property name, if there is one. Schema.org is a suitable collection in which to search for such names. Note that if no suitable name is found, users will later find it very easy to convert exported names to a form suitable for their purposes, provided that they can understand the meaning of the exported field name.

      The topic of ontologies should also be addressed when discussing how field names should be expressed. Ontologies may be viewed as rules for converting or classifying field names; for instance, an ontology can state that our field name in the form of the URI http://www.institution.ee/permitrecipients/naturalpersons/dob is precisely the same thing as schema.org’s Thing>Person>birthDate.
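
      Such an equivalence can be sketched in a few lines of Python. This is only an illustration, not a method prescribed by the text: it assumes the mapping is written down as an owl:equivalentProperty statement in N-Triples form, and that http://schema.org/birthDate is the URI behind Thing>Person>birthDate. The function names are hypothetical.

```python
# Sketch: declare that a locally defined field URI means the same as a schema.org property.
# Assumption: the equivalence is expressed with the OWL property owl:equivalentProperty.
OWL_EQUIVALENT_PROPERTY = "http://www.w3.org/2002/07/owl#equivalentProperty"

# Left: the institution's field URI used as the example in the text.
# Right: the schema.org property it is declared equivalent to.
FIELD_MAPPINGS = {
    "http://www.institution.ee/permitrecipients/naturalpersons/dob": "http://schema.org/birthDate",
}


def to_ntriples(mappings):
    """Return one N-Triples statement per mapping, in sorted order."""
    return [
        f"<{local}> <{OWL_EQUIVALENT_PROPERTY}> <{common}> ."
        for local, common in sorted(mappings.items())
    ]


def rename_fields(record, mappings):
    """Replace local field URIs with their common names; unknown keys stay unchanged."""
    return {mappings.get(key, key): value for key, value in record.items()}
```

      An application developer combining several sources could apply rename_fields to each exported record, while the publisher only needs to ship the statements produced by to_ntriples alongside the data.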

      From this point of view, ontologies are not a directly relevant or complicated topic for publishing data; rather, they are more of a useful tool for application developers who mash up data from different sources. When exporting data for publication, it is expedient to write ontologies oneself, either to document one’s own field names or to convert a subset of existing field names to schema.org names.

      Five-star formats. One way of linking data is to use, instead of the identifier used in a database, a more universal, de facto global identifier: the URI. Let us suppose that the state agrees on (or that the Population Register and Business Register adopt) a format for personal identification codes and Business Register codes consisting of http://prefix1.ee/prefix2/personalIDcode and http://prefix3.ee/prefix4/companycode. In such a case, the five-star representation of personal and company codes would be through URIs whose prefixes/URI formats are not the company’s own but, rather, the formats more broadly agreed upon. The same goes for the names of database fields, that is, the names of object properties.
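
      A minimal sketch of this conversion, assuming the placeholder prefixes above were the agreed formats (the text does not specify the real ones). The function names and the sample codes are made up for illustration.

```python
from urllib.parse import quote

# Placeholder prefixes copied from the text; the actual agreed formats are not given there.
PERSON_URI_PREFIX = "http://prefix1.ee/prefix2/"
COMPANY_URI_PREFIX = "http://prefix3.ee/prefix4/"


def person_uri(personal_id_code: str) -> str:
    """Turn a database-internal personal identification code into a global URI."""
    return PERSON_URI_PREFIX + quote(personal_id_code.strip(), safe="")


def company_uri(company_code: str) -> str:
    """Turn a Business Register code into a global URI."""
    return COMPANY_URI_PREFIX + quote(company_code.strip(), safe="")


# Made-up codes, used only to show the shape of the result.
example_person = person_uri("00000000000")
example_company = company_uri("12345678")
```

      Because every publisher would build the URI from the same agreed prefix, records about the same person or company in different datasets end up with identical identifiers and can be joined without a lookup table.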

      How should a dataset be published in practice? There are three main technological ways of publishing data.

      • For human-readable files, the directories containing the files are packaged, a short content description is added and the package is uploaded in freely downloadable form, preferably to the institution’s website at http://<asutuse domeeninimi>.ee/avaandmed (http://<institution domain name>.ee/opendata) or to the opendata.riik.ee directory.

      • For data held in databases, the database content is exported into a structured text format such as XML, CSV or JSON files, and then the same simple package-and-upload-to-web-server method is applied. If the database contains personal data not subject to disclosure, those fields are simply not exported.

      • As an alternative, the data in databases may be published as a free web service that can be used to find and download the entire content of the dataset or a filtered subset. The web service can be a SOAP service, but simpler ones are preferred, for instance JSON-based REST services or services that simply return CSV in response to GET or POST input parameters.
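
      The database export described in the second point can be sketched as follows. This is an illustration under stated assumptions rather than a recommended implementation: it uses an in-memory SQLite database, and the table name permits, its columns and the sample row are all invented. Only the explicitly listed public fields are read, so a column holding personal data never reaches the output.

```python
import csv
import io
import json
import sqlite3


def export_table(conn, table, public_fields):
    """Read only the listed fields, so non-disclosable columns never leave the database.

    Table and field names are interpolated into the SQL text, so they must come
    from trusted configuration, never from user input.
    """
    columns = ", ".join(f'"{name}"' for name in public_fields)
    cursor = conn.execute(f'SELECT {columns} FROM "{table}"')
    return [dict(zip(public_fields, row)) for row in cursor]


def to_csv(rows, public_fields):
    """Serialise exported rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=public_fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialise exported rows as JSON text."""
    return json.dumps(rows, ensure_ascii=False, indent=2)


def demo():
    """Build a hypothetical table and export it without the personal-data column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE permits (permit_no TEXT, issued TEXT, holder_id_code TEXT)")
    conn.execute("INSERT INTO permits VALUES ('P-1', '2011-05-02', '00000000000')")
    rows = export_table(conn, "permits", ["permit_no", "issued"])
    conn.close()
    return rows
```

      The resulting CSV or JSON text can then be written to files, packaged and uploaded as in the first point. The same export_table function could also sit behind a simple web service that returns its output on request.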

      The data must be described in a reasonable manner, i.e. a person with no previous experience of the dataset, but who understands the field and the technology, must be able to understand the purpose, structure and content of the dataset with reasonable effort.

      The dataset must include a description of the principles for updating it and the planned frequency of the updates. The publisher has no direct obligation to update the dataset regularly; what matters is that the update plan (or lack thereof) is recorded in writing in a comprehensible fashion.

      The datasets published by the institution must be easy to find. To this end, at least the following two means of announcing the existence of the data, their descriptions and their download links must be used.

      • A special directory on the institution’s own website, /avaandmed (/opendata), such as http://www.institution.ee/opendata.

      • National consolidated open data site/repository http://opendata.riik.ee.

      Open Spatial Data

      Kristian Teiter

      Estonian Land Board

      What are spatial data?

      Simply put, spatial data are data with a geographic location and form. Such data are also called geodata, geoinformation and location data. As a rule, spatial data are presented and used in the form of a map, which can be considered the spatial data output of a database. For instance, one of the outputs of the topographic data administered in the Topography Database of Estonia is topographical maps, but the data themselves can also be used and made available online in XML format.

      Fields that account for the principal use of spatial data are environmental protection, planning, construction, logistics, transport, the military and statistics, to name a few. More and more potential is seen in location, and the use of spatial data in different walks of life is seeing explosive growth.

      Address data are also spatial data. Indeed, it is precisely because of address data that a number of database administrators have recently been surprised to learn that their database contains spatial data.

      Publication and re-use

      In comparison with the rest of Europe and the world, Estonia does make public the spatial data that can be treated as public information. The Land Board, being the largest state map producer and administrator of spatial data, published cadastral unit data, administrative boundaries and topographic basic maps back in 2001 via its Web-based map server (http://geoportaal.maaamet.ee/). In 2008, Web Map Services (WMS) were added to the public map server; the Land Board’s maps can be accessed through these services using various GIS/CAD software. The map server and services are very popular, with a total of 500,000 visits per month.
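
      For readers unfamiliar with WMS, the sketch below shows roughly what a GIS client sends to such a service: an HTTP GET request whose query parameters follow the WMS 1.1.1 GetMap convention. The endpoint URL, the layer name and the bounding box numbers are hypothetical and do not describe the Land Board’s actual service; EPSG:3301 is the code of the Estonian national coordinate system and is assumed here as the default.

```python
from urllib.parse import urlencode


def wms_getmap_url(endpoint, layer, bbox, width, height,
                   srs="EPSG:3301", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap request URL.

    bbox is (min_x, min_y, max_x, max_y) in the units of the chosen SRS.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the layer's default style
        "SRS": srs,
        "BBOX": ",".join(str(value) for value in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and layer name; the bounding box values are illustrative only.
example_url = wms_getmap_url(
    "http://example.ee/wms",
    "base_map",
    (370000, 6370000, 740000, 6640000),
    800,
    600,
)
```

      The response to such a request is a rendered map image, which is why a WMS counts as publication rather than re-use in the sense discussed in this chapter.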

      A second example we can cite is the Environmental Information Centre, whose website displays all sorts of environment-related spatial data. One example is the forest register’s web service.

      The main difference between publication and re-use is that published spatial data can only be viewed, queried, searched and, to a limited extent, downloaded as an image and printed. Spatial data in re-usable form, by contrast, can be used without such limits and, as a rule, can be downloaded as databases in a suitable format. Making spatial data available specifically in re-usable form creates the preconditions for the private and third sectors to use public sector information for developing new and interesting services and applications.

      With regard to the availability of re-usable spatial data, Estonia presents a variegated picture. Some re-usable spatial data covering all of Estonia are easily available; other kinds are difficult to access for several reasons. Availability varies both from one agency to the next and within a given agency.

      Conformity to open data principles

      The following is my subjective assessment of the conformity of Estonia’s spatial data situation to selected open data principles. The fact that these are spatial data is not particularly important here, however. Rather, the situation has been shaped by the technological capability of database administrators, legal regulations, historical customs and other factors.

      One principle is that data have been gathered from original sources without processing and they have retained their original form and level of detail. This is a principle that was more problematic in the case of spatial data 10-15 years ago, but today it is generally no longer an issue. It took time before spatial data users understood that it was in their own interests for spatial data from information holders to be from the original source and up to date. Paradoxically, the implementation of open data policy in Estonia could make the situation even worse. In using applications developed by third parties,
