Difference between revisions of "Technology"

 

Revision as of 17:21, 25 November 2012

Technology is an integral component of many programs undertaken by the Noolaham Foundation. We use and build upon a range of technologies including scanning technologies, archiving and digital library software, educational software and collaboration tools.

The Technology Team is responsible for strategic technology planning and infrastructure implementation for all of Noolaham's operations. Documentation and data collection, scanning, the online library, project management and co-ordination all require cost-effective software and hardware infrastructure and support.

Technical Support & Maintenance

Customization, Hosting and Maintenance of MediaWiki

Noolaham's digital library currently uses MediaWiki as its digital library software. It is hosted pro bono by webspace2host.in in a LAMP environment. The digital library has three main components: the content (PDF documents, HTML documents and image files), the database containing the metadata, and the website. Data extraction from the templates and implementation of a semantic wiki are currently under way.
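As a rough illustration of what extracting data from wiki templates can involve, the sketch below pulls named parameters out of a single template call in wikitext. The template name `Book` and its fields are hypothetical and are not taken from Noolaham's wiki. The parser is deliberately simple and assumes no nested templates or links inside parameter values.

```python
import re
from typing import Dict


def parse_template(wikitext: str, name: str) -> Dict[str, str]:
    """Extract named parameters from the first {{name|...}} call.

    Simplification: nested templates and piped links inside
    parameter values are not handled.
    """
    pattern = re.compile(
        r"\{\{\s*" + re.escape(name) + r"\s*\|(.*?)\}\}", re.DOTALL
    )
    match = pattern.search(wikitext)
    if match is None:
        return {}
    fields = {}
    for part in match.group(1).split("|"):
        key, sep, value = part.partition("=")
        if sep:  # skip positional (unnamed) parameters
            fields[key.strip()] = value.strip()
    return fields


# Hypothetical template call, for illustration only.
page = "{{Book\n| title = Example Title\n| year = 1985\n}}"
print(parse_template(page, "Book"))
```

A production extractor would more likely rely on a full wikitext parser or on structured data exposed by the wiki itself. The regular expression here is only meant to show the shape of the task.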

Backup Restoration Plan

The content (PDFs, HTML files and image files), the database and the websites are backed up regularly at four geographical locations on three continents. This protects against physical and other threats and ensures continuity of the work.
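One way to check that a backup copy still matches the primary content is to compare checksums file by file. The sketch below assumes, purely for illustration, that the primary copy and one replica are both reachable as local directories. The function names are hypothetical and do not describe Noolaham's actual backup tooling.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_backups(primary: Path, replica: Path) -> Dict[str, List[str]]:
    """List files that are missing from, or differ in, the replica."""
    src, dst = manifest(primary), manifest(replica)
    return {
        "missing": sorted(k for k in src if k not in dst),
        "changed": sorted(k for k in src if k in dst and src[k] != dst[k]),
    }
```

Running `compare_backups` against each replica in turn gives a simple report of what needs to be re-copied. For remote locations the manifests could be generated on each site and compared centrally instead.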

Software, Hardware and Logistics Support

Many of Noolaham's projects are IT-driven. The Technology Team fulfills the IT needs of the Foundation in several ways: undertaking software development and customization, building collaboration and project management tools, and providing scanning solutions, among other activities.

Scanning Technologies

Noolaham’s technical team provides guidance on the scanners, scanning methods and processes to be implemented in Noolaham’s Scanning Centers. There are manual, semi-automatic and automatic scanners. Currently we mostly employ manual scanners, which are labour-intensive. In the short term, we plan to employ semi-automatic and automatic scanners to achieve greater efficiency while maintaining quality.

Preservation and Digital Archiving

Digital Library Software

Considering our growing needs, we have begun the process of transitioning to more advanced digital library software. We are currently evaluating various software solutions for implementation. Key considerations are multimedia support, metadata management, browsing and searching, social sharing and community features, multilingual support, workflow, access control, scalability, extensibility, maintenance, costs and the development community.

Multimedia Archiving

Images, video, audio and multimedia are increasingly used to share and communicate ideas. Digital technologies have made it easier to document tacit knowledge and intangible cultural heritage. In its first five years, Noolaham focused on documenting text-based records such as books, magazines and newspapers. With the Multimedia Archiving project we are building a collaborative platform using Mukurtu, which runs on the Drupal CMS.

Web Archiving

Tamil and Tamil-related content on the web is ephemeral. Web content gets lost because of hosting costs, lack of maintenance or changing technologies. Inspired by the Internet Archive, Noolaham’s Web Archiving project aims to archive informative and culturally important Tamil web content for posterity. The project involves building an intelligent vertical web crawler and a system to store, organize, search and retrieve web assets.
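A vertical crawler needs some way to decide whether a fetched page belongs to its topic. As a minimal sketch of one possible signal, the code below measures how much of a page's text falls in the Unicode Tamil block (U+0B80 to U+0BFF). The 0.3 threshold and the function names are assumptions made for illustration, not part of the project's design.

```python
def tamil_ratio(text: str) -> float:
    """Fraction of non-whitespace characters in the Unicode Tamil block."""
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return 0.0
    tamil = sum(1 for ch in chars if "\u0b80" <= ch <= "\u0bff")
    return tamil / len(chars)


def is_relevant(text: str, threshold: float = 0.3) -> bool:
    """Treat a page as in scope if enough of its text is Tamil script.

    The default threshold is an arbitrary illustrative value.
    """
    return tamil_ratio(text) >= threshold


print(is_relevant("தமிழ் இலக்கியம்"))
print(is_relevant("An English-only page"))
```

A script-based test like this would miss Tamil-related pages written in other languages, so a real crawler would probably combine it with keyword, link or domain-based signals.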

Dataset Development

Open data is critical for our society. Information about places, organizations, people, books, movies, media, arts, crafts and so on needs to be accessible to all and shared in a standard way. We work with other organizations to create standards and to collect and disseminate data sets.

Data and Information Architecture

Digital Library (cross disciplinary team: Program Team)

Applying information architecture principles and practices to digital libraries improves the findability and usability of archives. From enabling access and browsing to metadata and classification, the Technology Team works closely with the Program Team to design and implement the information architecture.

Virtual Learning Environment

The goal of the Virtual Class Room is to provide students with open courseware for self-directed learning as well as to assist in their regular school work. This project uses the Moodle learning management software and is in an early stage of development.

Tamil Natural Language Processing

Tamil Language Model

An open source Tamil language model is necessary to advance Tamil parsing, OCR, machine translation, speech recognition and related natural language processing technologies. These technologies will enable Noolaham’s and other Tamil collections to be searched, translated and converted into audio. This is a long-term research project.
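To make the idea of a language model concrete, here is a very small sketch: a character bigram model with add-one smoothing. It is far simpler than anything such a research project would produce, and the class name and toy training words are illustrative assumptions. It also works on Unicode code points rather than Tamil grapheme clusters, which is a simplification.

```python
import math
from collections import Counter
from typing import Iterable

START, END = "\u0002", "\u0003"  # sentinel boundary markers


class CharBigramModel:
    """Character bigram model with add-one (Laplace) smoothing."""

    def __init__(self, sentences: Iterable[str]) -> None:
        self.context_counts = Counter()
        self.bigram_counts = Counter()
        self.vocab = set()
        for sentence in sentences:
            padded = START + sentence + END
            self.vocab.update(padded)
            for a, b in zip(padded, padded[1:]):
                self.context_counts[a] += 1
                self.bigram_counts[(a, b)] += 1

    def prob(self, a: str, b: str) -> float:
        """Smoothed probability of character b following character a."""
        return (self.bigram_counts[(a, b)] + 1) / (
            self.context_counts[a] + len(self.vocab)
        )

    def log_prob(self, sentence: str) -> float:
        """Natural-log probability of a whole string under the model."""
        padded = START + sentence + END
        return sum(math.log(self.prob(a, b)) for a, b in zip(padded, padded[1:]))


# Toy training data, for illustration only.
model = CharBigramModel(["அம்மா", "அப்பா"])
print(model.log_prob("அம்மா") > model.log_prob("மாம்அ"))
```

A model like this scores a string seen in training higher than a scrambled version of it, which is the basic property that OCR post-correction and speech recognition build on.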

Tamil OCR

Noolaham Foundation has identified Tamil optical character recognition (OCR) and machine translation as high-priority technologies that will greatly assist with the Foundation's work. We work with university research groups primarily in the following areas: requirements gathering, benchmark preparation and data selection.
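A benchmark for OCR needs a way to score recognized text against a reference transcription. A common choice is the character error rate: the edit distance between the two strings divided by the length of the reference. The sketch below is a generic implementation, not one taken from the Foundation's benchmark work. It compares Unicode code points, so a single Tamil syllable may count as more than one character.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance using a rolling one-row table."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = 0 if ref_char == hyp_char else 1
            current.append(
                min(
                    previous[j] + 1,                 # deletion
                    current[j - 1] + 1,              # insertion
                    previous[j - 1] + substitution,  # match or substitution
                )
            )
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalized by the reference length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


print(character_error_rate("abcd", "abxd"))  # one substitution in four characters: 0.25
```

Averaging this score over a fixed set of scanned pages with hand-checked transcriptions would give a repeatable way to compare OCR systems.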