Harvesting and Indexing Toolkit (HIT) Demo
A demo of the Harvesting and Indexing Toolkit (HIT - code.google.com), developed by the Global Biodiversity Information Facility (GBIF - www.gbif.org). The demo shows how a user can harvest and index a dataset in Darwin Core Archive format (DwC-Archive - rs.tdwg.org) that has been published by an Integrated Publishing Toolkit (IPT - code.google.com), and how that information can then be accessed and displayed in a portal. The example portal used in the demo is GBIF's own (data.gbif.org). Please note that the tool can also harvest datasets published via three other data publishing protocols: DiGIR (digir.net), BioCASE (www.biocase.org) and TAPIR (www.tdwg.org).

To understand the demo: The demo is divided into 2 parts:

Part 1: Harvesting and Indexing a dataset using the HIT

subpart 1: Synchronise with GBIF Registry - During this operation, a user synchronises with the UDDI registry and collects information about all the available dataset access points.

subpart 2: Collect Resource Metadata - During this operation, the user chooses a dataset and performs a metadata update, in which metadata about the dataset is collected (dataset name, contact information, etc). The demo includes a quick shot of the IPT, showing where the dataset's access point (URI) actually comes from.

subpart 3: Harvest Data Records - During this operation, the user schedules the harvesting to take place. In the case of DwC-Archive datasets, this consists of two steps: 'Download' and 'Process harvested records ...
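To make the processing of a DwC-Archive more concrete, here is a minimal Python sketch of reading the core records from an archive. It is not the HIT's own code. It assumes the usual DwC-Archive layout: a zip file containing a meta.xml descriptor and a delimited core data file. The function names (read_core_records, build_sample_archive), the sample file name occurrence.txt and the sample records are all made up for illustration, and the archive is built in memory instead of being downloaded from an IPT.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by the meta.xml descriptor of a Darwin Core Archive.
DWC_TEXT_NS = "{http://rs.tdwg.org/dwc/text/}"


def read_core_records(archive_bytes):
    """Return the core records of a DwC-Archive as a list of dicts (illustrative helper)."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        meta = ET.fromstring(archive.read("meta.xml"))
        core = meta.find(DWC_TEXT_NS + "core")
        location = core.find(DWC_TEXT_NS + "files/" + DWC_TEXT_NS + "location").text

        # Map column index -> short term name (last segment of the term URI).
        columns = {int(core.find(DWC_TEXT_NS + "id").get("index")): "id"}
        for field in core.findall(DWC_TEXT_NS + "field"):
            if field.get("index") is not None:  # skip fields that only carry a default value
                columns[int(field.get("index"))] = field.get("term").rsplit("/", 1)[-1]

        # meta.xml writes a tab delimiter as the two characters backslash + t.
        delimiter = core.get("fieldsTerminatedBy", ",").replace("\\t", "\t")
        skip = int(core.get("ignoreHeaderLines", "0"))
        text = archive.read(location).decode(core.get("encoding", "UTF-8"))

    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))[skip:]
    return [{name: row[i] for i, name in columns.items()} for row in rows]


def build_sample_archive():
    """Build a tiny in-memory archive with made-up sample records."""
    meta = (
        '<archive xmlns="http://rs.tdwg.org/dwc/text/">'
        '<core encoding="UTF-8" fieldsTerminatedBy="\\t" ignoreHeaderLines="1" '
        'rowType="http://rs.tdwg.org/dwc/terms/Occurrence">'
        "<files><location>occurrence.txt</location></files>"
        '<id index="0"/>'
        '<field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>'
        "</core></archive>"
    )
    data = "id\tscientificName\n1\tPuma concolor\n2\tQuercus alba\n"
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("meta.xml", meta)
        archive.writestr("occurrence.txt", data)
    return buffer.getvalue()


records = read_core_records(build_sample_archive())
for record in records:
    print(record["id"], record["scientificName"])
```

The sketch reads only the core file and ignores extension files, default values and custom quoting rules, all of which a real harvester would need to handle.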