How to Tell a File's Format: Five Open Source Tools (Udemy)

You will learn how to use five free, open-source tools to identify the format, version, and profile of document files and to obtain their metadata. If you work in library or archive technology, or are a student preparing for that career, the course will give you a strong start in using these tools and understanding their strengths and weaknesses. The five central sections each cover one of these tools:

file: A command line tool included in Linux and Unix for simple file identification.

DROID: A batch-oriented tool from the UK National Archives, using the PRONOM format registry.

ExifTool: A metadata extraction tool that recognizes a broad range of formats.

JHOVE: Software developed at the Harvard University Library for careful validation of certain formats. I wrote most of the code for JHOVE.

Apache Tika: Content extraction software which can identify many formats.
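
Format identification of the kind these tools perform commonly starts with signature ("magic number") matching, that is, comparing a file's leading bytes against known patterns. The following is a minimal Python sketch of that idea only; it is not how any of the tools above is implemented, and the signature table and the function names identify_bytes and identify_file are illustrative assumptions.

```python
# Hypothetical, hand-picked signature table for illustration.
# Real tools rely on much larger, curated signature collections.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
]


def identify_bytes(header: bytes) -> str:
    """Return a format name if the header starts with a known signature."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def identify_file(path: str) -> str:
    """Read the first 16 bytes of a file and match them against the table."""
    with open(path, "rb") as f:
        return identify_bytes(f.read(16))
```

A leading-bytes check like this says nothing about a format's version or profile, or about whether the file is valid, which is part of why the course covers several tools with different strengths.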

For each tool, there's a discussion of how to use it followed by an on-screen demonstration of installing and using it, as well as a downloadable PDF summarizing the material.

You should be comfortable with installing software on your computer. Familiarity with the Unix/Linux command line is strongly recommended. Most, but not all, of the tools described can run on Windows. All will run on a Macintosh or Linux system.

This course is no longer available.

Price: 25 USD

Updated on 30 December 2017