Language and Culture Archive of Ashkenazic Jewry Digital Archive User Guide: Workspace at LCAAJResearch

This is a guide to using the Digital Archive to the Language and Culture Atlas of Ashkenazic Jewry (dlc.library.columbia.edu/lcaaj).

Links to LCAAJ Spreadsheets

To request a specific set of data from the printouts, please email lcaaj@libraries.cul.columbia.edu.  We will post links to data by request and as available.

Tools for GIS mapping

Columbia Libraries' Digital Social Science Center has a guide to using various mapping tools that might be useful in creating a map of LCAAJ data.

Some recommended tools include:

  • ArcGIS
  • QGIS
  • Carto
  • Google Earth

Workspace at LCAAJResearch

The primary aim of the LCAAJ data is to support the creation of maps.  A variety of software packages support this kind of activity in a digital format, including Google Earth, QGIS, Carto, ArcGIS, and others.  All of them allow the import of data in spreadsheet or tab- or comma-delimited format, with the possibility of filtering according to various attributes of that data.
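As a rough illustration of that filtering step, the sketch below reads comma-delimited rows and keeps only those that match one attribute. The column names are borrowed from the field list later in this guide; the sample values are invented placeholders, not LCAAJ data, and an actual export may be laid out differently.

```python
import csv
import io

# Invented placeholder rows; a real export will have more columns and real values.
SAMPLE_CSV = """Interviewee Number,Country of Interviewee,Answer
A1,Country X,answer one
A2,Country Y,answer two
A3,Country X,answer three
"""

def filter_rows(csv_text, column, value):
    """Return the rows (as dicts) whose `column` equals `value`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row[column] == value]

rows = filter_rows(SAMPLE_CSV, "Country of Interviewee", "Country X")
for row in rows:
    print(row["Interviewee Number"], row["Answer"])
```

The same filter can usually be applied inside the mapping tool itself after import; doing it beforehand simply keeps the imported file small.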

The data digitized and placed online in the Columbia Digital Library collection is not in machine-readable format.  Much of it is handwritten.  Researchers wanting to map the answers to a given question or questions will ultimately need to transcribe that information for each of the informants into some kind of spreadsheet, or to check it against OCRed data already available there (see below), where it can be associated with geographic coordinates and other relevant metadata.
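One possible way to associate transcribed answers with coordinates is sketched below: answers and locations are matched on the interviewee number and written out as GeoJSON, a format that QGIS, among other tools, can open. The function name, the dictionary layout, and all values are assumptions made for illustration; they are not part of the LCAAJ data or of any LCAAJResearch tooling.

```python
import json

def to_geojson(answers, coordinates):
    """Build a GeoJSON FeatureCollection from transcribed answers.

    answers: dict mapping interviewee number -> transcribed answer
    coordinates: dict mapping interviewee number -> (latitude, longitude)
    Interviewees without known coordinates are skipped.
    """
    features = []
    for interviewee, answer in answers.items():
        if interviewee not in coordinates:
            continue
        lat, lon = coordinates[interviewee]
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered longitude first, then latitude.
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"interviewee": interviewee, "answer": answer},
        })
    return {"type": "FeatureCollection", "features": features}

# Invented placeholder values, not LCAAJ data.
answers = {"A1": "answer one", "A2": "answer two", "A3": "answer three"}
coordinates = {"A1": (50.0, 20.0), "A2": (52.5, 21.5)}

collection = to_geojson(answers, coordinates)
print(json.dumps(collection, indent=2))
```

Interviewee "A3" has no coordinates in this sample and is therefore left out of the result; a real workflow might instead report such rows for follow-up.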

The workspace provided by LCAAJResearch is designed to support such work in a collaborative fashion. Individuals interested in working on specific questions can ask us to load Google spreadsheets here, containing the OCRed (but only partially corrected) printouts produced by the NEH project, along with space for additional entries based on data in Answer Sheets and Blue Books not contained in the printouts.  To request the data for a specific question, contact us at lcaaj@libraries.cul.columbia.edu. We will make these spreadsheets on an as-requested basis in order to take advantage over time of increased opportunities for cleaning the OCR and enriching the metadata in advance.

It is our hope that this space can foster collaboration that helps the scholarly community avoid duplication of effort and move as quickly as possible to the creation of a large collection of machine-readable maps for comparison and analysis.  We also hope that this site, along with the impressive resources already available at EYDES, will foster increased dialog about and use of this important collection, eventually enabling the addition of further metadata and mapping tools.

The spreadsheets will contain all of the question and interviewee metadata fields relevant to that answer, but most of the fields will initially be hidden. To avoid unintentional corruption, only a few fields will be editable.

The initially displayed fields will be:

Basic data (read-only)

Question Number: Combination of page and question number as described elsewhere in this document.
Interviewee Number: Number identifying the geographic locations for the interviewees.
Answer: Either an OCRed answer (sometimes just blank space if the software failed to read correctly) or an indication that this is a placeholder template.

Editable Fields:

Transcribed Answer: This is where most of your work will be concentrated, either copying a correctly OCRed answer from the Answer field or simply inputting it from the online digitized images of Printouts, Answer Sheets, or Blue Books.
Transcribed Answer only: The actual answer, separated from the accompanying notations.
Transcribed Notation: Notation only, without the answer.
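When preparing data for a map, one plausible convention is to prefer the corrected Transcribed Answer and fall back to the raw OCRed Answer only when no transcription exists yet. The helper below is a minimal sketch of that convention; the function name is hypothetical, and it assumes each row is available as a dictionary keyed by the field names listed above.

```python
def best_answer(row):
    """Return the transcribed answer if present, otherwise the OCRed answer.

    `row` is assumed to be a dict keyed by spreadsheet field name.
    Missing or blank values are treated as empty strings.
    """
    transcribed = (row.get("Transcribed Answer") or "").strip()
    if transcribed:
        return transcribed
    return (row.get("Answer") or "").strip()

# Invented placeholder rows, not LCAAJ data.
corrected = {"Answer": "ocr gues", "Transcribed Answer": "checked answer"}
uncorrected = {"Answer": "ocr only", "Transcribed Answer": ""}

print(best_answer(corrected))
print(best_answer(uncorrected))
```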

The initially hidden fields:

Editable fields

Transcribed LocREF:
Transcribed R:
Transcribed *:
Transcribed Question Number:
Transcribed Interviewee Number:

Read-only fields

LOCREF: Reference to another interviewee in the answer
R: Notes the presence of a reference to another interviewee (Y) or to a linguistic topic (/) in the answer
*: Indication that answer continues on next record line
PO Number: Number of the page in the printouts where this answer is found
QP Number: Page number of the questionnaire
Printout Series:
Box and File Number: For records taken from printouts
PO Sort Number:  Sequence Number from original printouts, enabling users who are reading through a given printout to move more systematically through the data by sorting by Box and File Number and 
Survey: SMQ or WY
Question in Yiddish:
Question in Transliteration:
Linguistic Topic Numbers Addressed by Question:
Ethnographic Table of Contents Entry of Question:
Latitude of Interviewee Location:
Longitude of Interviewee Location:
Country of Interviewee:
LCAAJ Region of Interviewee:

Interviewer ID:
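Because the * field marks answers that continue on the next record line, it may be convenient to join such lines into a single answer before mapping. The sketch below shows one way to do so. It rests on several assumptions that are not confirmed by this guide: that rows are already in printout order, that any non-blank value in the * column means "continued", and that the parts should be joined with a single space. The function name and sample rows are hypothetical.

```python
def merge_continuations(rows):
    """Join answers that are split across consecutive record lines.

    rows: list of dicts with at least "Answer" and "*" keys, in printout order.
    A non-blank "*" value is assumed to mean the answer continues on the next row.
    Returns a new list; the input rows are not modified.
    """
    merged = []
    pending = None  # the row currently being assembled
    for row in rows:
        if pending is None:
            pending = dict(row)
        else:
            pending["Answer"] = (pending["Answer"] + " " + row["Answer"]).strip()
        continues = bool((row.get("*") or "").strip())
        if not continues:
            pending["*"] = ""  # the assembled answer is now complete
            merged.append(pending)
            pending = None
    if pending is not None:
        # Last row was flagged as continuing but nothing follows; keep it as is.
        merged.append(pending)
    return merged

# Invented placeholder rows, not LCAAJ data.
sample = [
    {"Interviewee Number": "A1", "Answer": "first part", "*": "*"},
    {"Interviewee Number": "A1", "Answer": "second part", "*": ""},
    {"Interviewee Number": "A2", "Answer": "whole answer", "*": ""},
]

for row in merge_continuations(sample):
    print(row["Interviewee Number"], row["Answer"])
```

The merged row keeps the metadata of the first line of each answer. Checking that continuation lines share the same interviewee number would be a sensible extra safeguard.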

We plan to include a list of the questions currently transcribed or already available for use by others.  During the transcription process, we can reserve access to the editable fields to the requesting researcher only.  We also recognize that some researchers, engaged in dissertation or publication research, may not be ready to share their research, and we are happy to find ways of restricting read-only access to them alone.

LCAAJResearch will be an evolving project, and we expect its features, procedures, and perhaps even venue to change once the research community has had an opportunity to work here.