Parse

From CCIL
Revision as of 09:18, 14 May 2017 by Atanas.ilchev (API)

About

This component covers CCIL's ability to read and understand, at a basic level, various formats of inbound media. The API is designed to be as extensible as possible. The formats supported so far are:

  • Plain Text
  • PDF

API

The central point is the Parser interface. Its main purpose is to decompose an InputStream into the features it contains.

The API defines the following features:

Name     Constant        Description
Title    Parser.TITLE    The title of the media.
Content  Parser.CONTENT  The textual body of the media.
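
To make the idea of decomposing an InputStream into features more concrete, here is a minimal sketch in Java. It is not the actual CCIL source. The method signature `parse(InputStream)`, the `Map<String, String>` return type, the string values of the two constants, and the `PlainTextParser` class are all assumptions made for illustration. Only the names Parser, Parser.TITLE and Parser.CONTENT come from this page. The sketch treats the first line of a plain-text document as its title and everything after it as its content.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical shape of the Parser interface; the real CCIL signature may differ.
interface Parser {
    String TITLE = "title";     // assumed constant value
    String CONTENT = "content"; // assumed constant value

    // Assumed method: reads the stream and returns the features found in it.
    Map<String, String> parse(InputStream in) throws IOException;
}

// Illustrative plain-text parser (not a CCIL class): the first line of the
// trimmed text becomes the title, the remainder becomes the content.
class PlainTextParser implements Parser {
    @Override
    public Map<String, String> parse(InputStream in) throws IOException {
        String text = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        // Split at the first line break only; \R matches any line terminator.
        String[] parts = text.strip().split("\\R", 2);

        Map<String, String> features = new LinkedHashMap<>();
        features.put(TITLE, parts[0].strip());
        features.put(CONTENT, parts.length > 1 ? parts[1].strip() : "");
        return features;
    }
}

class ParserSketchDemo {
    public static void main(String[] args) throws IOException {
        byte[] bytes = "Report\n\nFirst line.\nSecond line."
                .getBytes(StandardCharsets.UTF_8);
        Map<String, String> features =
                new PlainTextParser().parse(new ByteArrayInputStream(bytes));

        System.out.println(features.get(Parser.TITLE));
        System.out.println(features.get(Parser.CONTENT));
    }
}
```

Under these assumptions, keying the result by constants such as Parser.TITLE would allow new features to be added without changing the method signature, which fits the extensibility goal stated in the About section.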

Services

  • TikaParserService