Monday, January 24, 2011

Best Practice Guidelines for the Dublin Core

Digital Archives Union Catalog
Dublin Core
Importing into the Union Catalog

Source:

"Best Practice Guidelines for the Dublin Core"
- DC elements

Friday, January 21, 2011

10.9.3. Harvesting API

The org.dspace.search package also provides a 'harvesting' API. This allows callers to extract information
about items modified within a particular timeframe and within a particular scope (all of DSpace, or a single community or
collection). Currently this is used by the Open Archives Initiative metadata harvesting protocol application and the
e-mail subscription code.
The Harvest.harvest method is invoked with the required scope and start and end dates. Either date can be omitted. The
dates should be in the ISO 8601, UTC time zone format used elsewhere in the DSpace system.
HarvestedItemInfo objects are returned. These objects are simple containers with basic information about the
items falling within the given scope and date range. Depending on the parameters passed to the harvest method, the
containers and item fields may have been filled out with the IDs of communities and collections containing an
item, and the corresponding Item object, respectively. Electing not to have these fields filled out means the harvest
operation executes considerably faster.
If required, Harvest also offers a method for creating a single HarvestedItemInfo object, which
might make things easier for the caller.
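
To make the scope and date-range filtering described above concrete, here is a minimal sketch in plain Java. It is not the real org.dspace.search code: HarvestSketch, ItemInfo and the harvest signature are hypothetical stand-ins, and treating both date bounds as inclusive is an assumption. It only illustrates the idea that either date may be omitted (passed as null) and that dates are ISO 8601 timestamps in UTC.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; not the real DSpace Harvest class.
class HarvestSketch {

    /** Simple container: handle, owning collection, last-modified time (UTC). */
    static final class ItemInfo {
        final String handle;
        final String collection;
        final Instant lastModified;

        ItemInfo(String handle, String collection, String lastModifiedIso) {
            this.handle = handle;
            this.collection = collection;
            this.lastModified = Instant.parse(lastModifiedIso); // e.g. 2011-01-21T00:00:00Z
        }
    }

    /**
     * Returns items modified within [start, end] (bounds assumed inclusive).
     * A null scope means "all items"; a null start or end leaves that side open.
     */
    static List<ItemInfo> harvest(List<ItemInfo> all, String scope, String start, String end) {
        Instant from = (start == null) ? null : Instant.parse(start);
        Instant to = (end == null) ? null : Instant.parse(end);
        List<ItemInfo> result = new ArrayList<>();
        for (ItemInfo item : all) {
            if (scope != null && !scope.equals(item.collection)) continue;
            if (from != null && item.lastModified.isBefore(from)) continue;
            if (to != null && item.lastModified.isAfter(to)) continue;
            result.add(item);
        }
        return result;
    }
}
```

In this sketch the containers hold only cheap fields; loading full item objects on demand would be the analogue of the optional, slower "filled out" fields mentioned above.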

10.9.2. Indexed Fields
The DSIndexer class shipped with DSpace indexes the Dublin Core metadata in the following way:

10.9.1. Our Lucene Implementation

Currently we have our own Analyzer and Tokenizer classes (DSAnalyzer and DSTokenizer) to customize our indexing. They invoke the stemming and stop word features within Lucene. We create an IndexReader for each
query, which we now realize isn't the most efficient use of resources: we seem to run out of filehandles under really heavy
loads. (A wildcard query can open many filehandles!) Since Lucene is thread-safe, a better future implementation
would be to have a single Lucene IndexReader shared by all queries, which is invalidated and re-opened when the
index changes. Future API growth could include relevance scores (Lucene generates them, but we ignore them) and
abstractions for more advanced search concepts such as booleans.
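
The "single shared reader, re-opened when the index changes" idea can be sketched without any Lucene dependency. The class below is a hypothetical illustration, not DSpace or Lucene code: SharedReaderHolder, indexChanged and openCount are invented names, and the "reader" is whatever object the supplied opener returns.

```java
import java.util.function.Supplier;

// Hypothetical sketch: one reader instance shared by all queries,
// replaced lazily after the index has been modified.
class SharedReaderHolder<R> {
    private final Supplier<R> opener;   // how to open a fresh reader
    private R current;                  // the shared reader, if any
    private long openedAtVersion = -1;  // index version the reader was opened at
    private long indexVersion = 0;      // bumped whenever the index changes
    private int opens = 0;              // number of times a reader was opened

    SharedReaderHolder(Supplier<R> opener) {
        this.opener = opener;
    }

    /** Called by the indexing code after it has written to the index. */
    synchronized void indexChanged() {
        indexVersion++;
    }

    /** Returns the shared reader, re-opening it only if the index changed. */
    synchronized R get() {
        if (current == null || openedAtVersion != indexVersion) {
            current = opener.get();
            openedAtVersion = indexVersion;
            opens++;
        }
        return current;
    }

    synchronized int openCount() {
        return opens;
    }
}
```

A real implementation would also have to close the stale reader once in-flight queries have finished with it; that bookkeeping is left out of this sketch.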

10.9. Search





2.16. OAI Support

The Open Archives Initiative [http://www.openarchives.org/] has developed a protocol for metadata harvesting [http://www.openarchives.org/OAI/openarchivesprotocol.html]. This allows sites to programmatically retrieve or 'harvest' the metadata from several sources, and offer services using that metadata, such as indexing or linking services. Such a service could allow users to access information from a large number of sites from one place.
DSpace exposes the Dublin Core metadata for items that are publicly (anonymously) accessible. The collection structure is also exposed via the OAI protocol's 'sets' mechanism. OCLC's open source OAICat [http://www.oclc.org/research/software/oai/cat.shtm] framework is used to provide this functionality. You can also configure the OAI service to make use of any crosswalk plugin to offer additional metadata formats, such as MODS.
DSpace's OAI service supports exposing deletion information for withdrawn items, but not for items that are 'expunged' (see above). DSpace also supports OAI-PMH resumption tokens.
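
As an illustration of what a harvester sends, the sketch below builds OAI-PMH ListRecords request URLs. The verb and the argument names (metadataPrefix, from, until, set, resumptionToken) come from the OAI-PMH specification, and oai_dc is its mandatory Dublin Core format. The class name, the base URL and the set name used with it are assumptions for illustration, not values taken from DSpace.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration; the base URL is supplied by the caller.
class OaiRequestSketch {

    /** Builds a ListRecords request for Dublin Core; null arguments are omitted. */
    static String listRecords(String baseUrl, String from, String until, String set) {
        StringBuilder url = new StringBuilder(baseUrl)
                .append("?verb=ListRecords&metadataPrefix=oai_dc");
        if (from != null) url.append("&from=").append(enc(from));
        if (until != null) url.append("&until=").append(enc(until));
        if (set != null) url.append("&set=").append(enc(set));
        return url.toString();
    }

    /** Follow-up request: resumptionToken is sent with the verb and nothing else. */
    static String resume(String baseUrl, String resumptionToken) {
        return baseUrl + "?verb=ListRecords&resumptionToken=" + enc(resumptionToken);
    }

    private static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```

A harvester would call listRecords once, then keep calling resume with each token the server returns until the response carries no further token.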

2.14. Search and Browse

DSpace allows end-users to discover content in a number of ways, including:
• Via external reference, such as a Handle
• Searching for one or more keywords in metadata or extracted full-text
• Browsing through title, author, date or subject indices, with optional image thumbnails
Search is an essential component of discovery in DSpace. Users' expectations of a search engine are quite high, so a goal for DSpace is to supply as many search features as possible. DSpace's indexing and search module has a very simple API which allows for indexing new content, regenerating the index, and performing searches on
the entire corpus, a community, or a collection. Behind the API is the Java freeware search engine Lucene
[http://jakarta.apache.org/lucene/]. Lucene gives us fielded searching, stop word removal, stemming, and the ability to
incrementally add new indexed content without regenerating the entire index. The specific Lucene search indexes are
configurable, enabling institutions to customize which DSpace metadata fields are indexed.
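
To show roughly what fielded searching, stop word removal and incremental indexing mean, here is a toy inverted index in plain Java. It is a sketch under stated assumptions, not how Lucene or DSpace implement search: FieldedIndexSketch and its stop word list are invented for illustration, and stemming is deliberately left out.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration only; Lucene's real index structures are far more elaborate.
class FieldedIndexSketch {
    // Assumed stop word list, for illustration.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "of", "the"));

    // Key "field:term" -> ids of the items containing that term in that field.
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Adds one field of one item; earlier entries are left untouched. */
    void add(String itemId, String field, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) continue;
            postings.computeIfAbsent(field + ":" + token, k -> new TreeSet<>()).add(itemId);
        }
    }

    /** Returns the ids of items whose given field contains the term. */
    Set<String> search(String field, String term) {
        Set<String> hits = postings.get(field + ":" + term.toLowerCase(Locale.ROOT));
        return hits == null ? new TreeSet<>() : new TreeSet<>(hits);
    }
}
```

Because add only appends to the posting lists, new items become searchable without rebuilding anything, which is the incremental behaviour the paragraph above refers to.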
Another important mechanism for discovery in DSpace is the browse. This is the process whereby the user views a
particular index, such as the title index, and navigates around it in search of interesting items. The browse subsystem
provides a simple API for achieving this by allowing a caller to specify an index, and a subsection of that index.
The browse subsystem then discloses the portion of the index of interest. Indices that may be browsed are item title,
item issue date, item author, and subject terms. Additionally, the browse can be limited to items within a particular
collection or community.
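
The browse idea, picking an index and asking for a subsection of it, can be sketched with a sorted map. This is a hypothetical illustration rather than the DSpace browse API: BrowseSketch and its methods are invented names, titles are assumed to be unique, and restricting the browse to a collection or community is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical title index for illustration; maps title -> item handle.
class BrowseSketch {
    private final TreeMap<String, String> titleIndex =
            new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    void add(String title, String handle) {
        titleIndex.put(title, handle);
    }

    /** Returns up to max titles, in index order, starting at the given position. */
    List<String> browse(String startAt, int max) {
        List<String> page = new ArrayList<>();
        for (String title : titleIndex.tailMap(startAt, true).keySet()) {
            if (page.size() >= max) break;
            page.add(title);
        }
        return page;
    }
}
```

For example, with the titles "Apple", "banana", "Cherry" and "date" indexed, browse("b", 2) returns "banana" and "Cherry": the caller names a starting point and a window size, and gets back only that portion of the index.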

Wednesday, January 12, 2011

Yilan County Residents File Complaint Against Kuan Chung and Others over Counting Party Posts Toward Civil Service

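False retirement seniority
Facts of the offense of fraudulently obtaining pensions from public funds
Embezzlement of public property; fraudulently obtaining property by taking advantage of an official position; unlawful profiteering; and the Criminal Code offense of causing a public official to make a false entry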

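Source: Liberty Times Net (自由電子報) - Yilan County Residents File Complaint Against Kuan Chung and Others over Counting Party Posts Toward Civil Service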