redi package

Submodules

redi.batch module

Functions related to the RediBatch database

redi.batch.BATCH_STATUS_COMPLETED = 'Completed'

@see #check_input_file()

The first time the app runs, there is no SQLite file in which to store the md5 sums of the input file. This function creates an empty RediBatch table in the SQLite file specified by db_path.

@return True if the database file was properly created with an empty table

redi.batch.add_batch_entry(db_path, md5)[source]

Inserts a row into the RediBatch table. @see #check_input_file()

Parameters:
  • db_path (string) – The SQLite database file name
  • md5 (string) – The md5 sum to be inserted
  • create_time (string) – The batch start time
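
A minimal sketch of creating the table and inserting a batch row, using only the standard sqlite3 module. The column names, the "Started" status value, and the helper names are assumptions made for illustration; the actual RediBatch schema is not described in this reference.

```python
import sqlite3


def sketch_create_table(conn):
    # Hypothetical schema: the real RediBatch columns are not documented here.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS RediBatch (
               rbID INTEGER PRIMARY KEY AUTOINCREMENT,
               rbStartTime TEXT NOT NULL,
               rbEndTime TEXT,
               rbStatus TEXT,
               rbMd5Sum TEXT NOT NULL)"""
    )
    conn.commit()


def sketch_add_entry(conn, md5, create_time):
    # Parameterized insert; returns the id of the new row.
    cur = conn.execute(
        "INSERT INTO RediBatch (rbStartTime, rbStatus, rbMd5Sum) VALUES (?, ?, ?)",
        (create_time, "Started", md5),
    )
    conn.commit()
    return cur.lastrowid


conn = sqlite3.connect(":memory:")  # in-memory database, for the demo only
sketch_create_table(conn)
batch_id = sketch_add_entry(
    conn, "d41d8cd98f00b204e9800998ecf8427e", "2014-06-24 01:23:24"
)
```

The real functions take a db_path rather than an open connection; the sketch passes a connection so it can run against an in-memory database.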
redi.batch.check_input_file(batch_warning_days, db_path, email_settings, raw_xml_file, project, start_time)[source]
redi.batch.create_empty_md5_database(db_path)[source]
redi.batch.create_empty_table(db_path)[source]
redi.batch.dict_factory(cursor, row)[source]
redi.batch.get_batch_by_id(db_path, batch_id)[source]
redi.batch.get_days_since_today(date_string)[source]

@return the number of days passed since the specified date
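
One way such a day count could be computed is sketched below. It assumes the date string uses the "2014-06-24" format shown elsewhere in this module; the helper name and the `today` parameter are hypothetical and exist only to make the example testable.

```python
from datetime import date, datetime


def days_since(date_string, today=None, fmt="%Y-%m-%d"):
    # `today` is injectable so the result does not depend on the wall clock.
    today = today or date.today()
    then = datetime.strptime(date_string, fmt).date()
    return (today - then).days
```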

redi.batch.get_db_friendly_date()[source]

@return string in format: 2014-06-24

redi.batch.get_db_friendly_date_time()[source]

@return string in format: “2014-06-24 01:23:24”
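
The two formats above correspond to the following strftime patterns. This is a sketch with hypothetical helper names, not the module's code; the `now` parameter is an assumption added so the output can be checked against the documented examples.

```python
from datetime import datetime


def db_friendly_date(now=None):
    # Date only, e.g. 2014-06-24
    now = now or datetime.now()
    return now.strftime("%Y-%m-%d")


def db_friendly_date_time(now=None):
    # Date and time, e.g. 2014-06-24 01:23:24
    now = now or datetime.now()
    return now.strftime("%Y-%m-%d %H:%M:%S")
```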

redi.batch.get_last_batch(db_path)[source]
redi.batch.get_md5_input_file(input_file)[source]

Returns the md5 sum for the redi input file.

@see #check_input_file()
@see https://docs.python.org/2/library/hashlib.html
@see https://docs.python.org/2/library/sqlite3.html#sqlite3.Connection.row_factory
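
The two standard-library features referenced above can be sketched as follows: hashing a file in chunks with hashlib, and a row factory that makes sqlite3 return rows as dictionaries. The function name md5_of_file and the chunk size are assumptions; the row factory follows the recipe in the Python sqlite3 documentation.

```python
import hashlib
import sqlite3


def md5_of_file(path, chunk_size=65536):
    # Read in chunks so large input files are never loaded whole into memory.
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dict_factory(cursor, row):
    # Map each column name from cursor.description to the row's value.
    return {col[0]: value for col, value in zip(cursor.description, row)}


conn = sqlite3.connect(":memory:")
conn.row_factory = dict_factory  # rows now come back as dicts
```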

redi.batch.printxml(tree)[source]

Helper function for debugging xml content

redi.batch.update_batch_entry(db_path, id, status, start_time, end_time)[source]

Update the status and the start/end time of a specified batch entry. Returns True if the update succeeded, False otherwise.

Parameters:
  • db_path – string
  • id – integer
  • status – string
  • start_time – datetime string
  • end_time – datetime string

redi.form module

class redi.form.Event(etree_node)[source]

Bases: object

field(name)[source]
fields()[source]
form_name
is_empty()[source]
name
study_id
class redi.form.Field(etree_node)[source]

Bases: object

clear_value()[source]
name
value
class redi.form.Form(data)[source]

Bases: object

events()[source]

redi.redi module

redi.py - Converter from raw clinical data in XML format to REDCap API data

Usage:
    redi.py -h | --help
    redi.py [-v] [-V] [-k] [-e] [-d] [-r] [-c=<path>] [-D=<datadir>] [-s] [-b]

Options:
    -h --help                 Show this help message and exit
    -v --verbose              Increase output verbosity [default:False]
    -V --version              Show version number [default:False]
    -k --keep                 Use this option to preserve the files
                              generated during execution [default:False]
    -e --emrdata              Use this option to get EMR data [default:False]
    -d --dryrun               To execute redi.py in dry run state. This
                              is to be able to test each release by doing
                              a dry run, where the data is fetched and
                              processed but not transferred to the
                              production REDCap. Email is also not sent.
                              The processed data is stored as output files
                              under the "out" folder under project root
                              [default:False].
    -r --resume               WARNING!!! Resumes the last run. This
                              switch is for a specific case. Check the
                              documentation before using it. [default:False]
    -c --config-path=<path>   Specify the path to the configuration directory
    -D --datadir=<datadir>    Specify the path to the directory containing
                              project specific input and output data which
                              will help in running multiple simultaneous
                              instances of redi for different projects
    -s --skip-blanks          Skip blank events when sending data to REDCap
                              [default:False]
    -b --bulk-send-blanks     Send blank events in bulk instead of
                              individually [default:False]
class redi.redi.PersonFormEventsRepository(filename, logger=None)[source]

Bases: object

Wrapper for the person-form-events XML file

delete()[source]
fetch()[source]
store(pfe_tree)[source]
class redi.redi.SentEvents(filename, writer=None, reader=None)[source]

Bases: object

List of form events that have been sent to REDCap

Parameters:
  • filename – file location
  • writer – delegate called after an event has been marked sent
  • reader – function to read previously sent events from disk
mark_sent(study_id_key, form_name, event_name)[source]
was_sent(study_id_key, form_name, event_name)[source]
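
A rough sketch of how a class with this interface might behave, assuming sent events are tracked as (study_id_key, form_name, event_name) tuples. The delegate signatures (writer receiving the events and the filename, reader receiving the filename) are assumptions, not taken from the RED-I source.

```python
class SentEventsSketch(object):
    """Illustrative stand-in for redi.redi.SentEvents; not the real class."""

    def __init__(self, filename, writer=None, reader=None):
        self.filename = filename
        # Assumed delegate signatures: writer(events, filename), reader(filename).
        self._writer = writer or (lambda events, filename: None)
        reader = reader or (lambda filename: [])
        self._sent = set(reader(filename))

    def mark_sent(self, study_id_key, form_name, event_name):
        self._sent.add((study_id_key, form_name, event_name))
        # The writer delegate runs after the event has been marked sent.
        self._writer(sorted(self._sent), self.filename)

    def was_sent(self, study_id_key, form_name, event_name):
        return (study_id_key, form_name, event_name) in self._sent
```

Injecting the reader and writer keeps the bookkeeping testable without touching the disk, which is presumably why the constructor accepts them.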
redi.redi.add_elements_to_tree(data)[source]

Add blank elements to fill out in ElementTree.

Add element to data ElementTree for timestamp, redcap form name, eventName, formDateField, and formCompletedFieldName.

Parameters:data – the input ElementTree from the parsed raw XML file.
redi.redi.compress_data_using_study_form_date(data)[source]

Removes duplicate results that were recorded on the same date but at different times. Warnings:

  • we assume that the passed ElementTree is sorted
  • we skip all “Canceled” results, but we want to keep at least one, so when all results are canceled we keep the first one
  • the passed object is altered

@see #get_key_date() @see #get_key_timestamp() @see #sort_element_tree()

Parameters:data – the ElementTree object that needs to be compressed
Returns:None
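
The deduplication rule can be illustrated with the standard-library ElementTree (RED-I itself uses lxml). This is a sketch under stated assumptions: the tag names redcapFormName and timestamp, the use of RESULT to carry the "Canceled" marker, and the choice to keep the first non-canceled result per group are all guesses made for illustration.

```python
import xml.etree.ElementTree as ET


def key_date(subject):
    # Quadruple (study_id, form_name, loinc_code, date); the date is taken
    # as the first 10 characters of an assumed "YYYY-MM-DD HH:MM:SS" timestamp.
    return (
        subject.findtext("STUDY_ID"),
        subject.findtext("redcapFormName"),
        subject.findtext("loinc_code"),
        (subject.findtext("timestamp") or "")[:10],
    )


def compress_by_date(root):
    # Group <subject> elements by key, preserving document order.
    groups = {}
    for subject in root.findall("subject"):
        groups.setdefault(key_date(subject), []).append(subject)
    for members in groups.values():
        live = [s for s in members if s.findtext("RESULT") != "Canceled"]
        # Keep the first usable result, or the first canceled one if none is usable.
        keep = live[0] if live else members[0]
        for subject in members:
            if subject is not keep:
                root.remove(subject)  # alters the passed tree in place
```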

redi.redi.configure_logging(data_folder, verbose=False)[source]

Configures the Logger

redi.redi.connect_to_redcap(email_settings, redcap_settings, dry_run=False)[source]
redi.redi.convert_component_id_to_loinc_code(data, component_to_loinc_code_xml_tree)[source]

This function converts COMPONENT_ID in raw data to loinc_code based on the mapping provided in the xml file

Parameters:
  • data – Raw data xml tree
  • component_to_loinc_code_xml_tree – COMPONENT_ID to loinc_code mapping xml file tree.
redi.redi.convert_none_type_object_to_empty_string(my_object)[source]

Replace NoneType objects with an empty string; otherwise return the object unchanged.

redi.redi.copy_data_to_person_form_event_tree(raw_data_tree, person_form_event_tree, form_events_tree)[source]

This function copies data from the raw_data_tree to the person_form_event_tree

Parameters:
  • raw_data_tree – This parameter holds raw data tree
  • person_form_event_tree – This parameter holds person form event tree
  • form_events_tree – This parameter holds form events tree
redi.redi.create_empty_event_tree_for_study(raw_data_tree, all_form_events_tree)[source]

This function uses raw_data_tree and all_form_events_tree and creates a person_form_event_tree for study

Parameters:
  • raw_data_tree – This parameter holds raw data tree
  • all_form_events_tree – This parameter holds all form events tree
redi.redi.create_empty_events_for_one_subject(form_events_tree, translation_table_tree)[source]
redi.redi.create_empty_events_for_one_subject_helper(form_events_file, translation_table_file)[source]

This function creates new copies of the form_events_tree and translation_table_tree and calls create_empty_events_for_one_subject.

Parameters:
  • form_events_file – This parameter holds the path of form_events file
  • translation_table_file – This parameter holds the path of translation_table file

redi.redi.get_db_path(batch_info_database, database_path)[source]
redi.redi.get_email_settings(settings)[source]

Helper function for grouping email-related properties

redi.redi.get_key_date(ele)[source]

Helper function for #compress_data_using_study_form_date()

Parameters:ele – lxml.etree._Element object for which we build a key
Returns:the corresponding quadruple (study_id, form_name, loinc_code, date)

redi.redi.get_key_timestamp(ele)[source]

Helper function for #sort_element_tree() @see #compress_data_using_study_form_date()

Parameters:ele – lxml.etree._Element object for which we build a key
Returns:the corresponding triple (study_id, form_name, timestamp)

redi.redi.get_redcap_settings(settings)[source]

Helper function for grouping redcap connection properties

redi.redi.load_preproc(preprocessors, root='./')[source]

Copied and modified version of load_rules function. TODO: fix load_rules and load_prerules for better parallelism

redi.redi.load_rules(rules, root='./')[source]

Load custom post-processing rules.

Rules should be added to the configuration file under a property called “rules”, which has key-value pairs mapping a unique rule name to a Python file. Each Python file intended to be used as a rules file should have a run_rules() function which takes one argument.

Example config.json:
    { "rules": { "my_rules": "rules/my_rules.py" } }
Example rules file:
    def run_rules(data):
        pass
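
A sketch of how rule files like the one above could be loaded from the "rules" mapping, using importlib from Python 3. This is not the RED-I implementation (which targets Python 2); the function name load_rules_sketch and the error raised for a missing run_rules() are assumptions.

```python
import importlib.util
import os


def load_rules_sketch(rules, root="./"):
    """Import each rules file and return {rule_name: module}."""
    loaded = {}
    for name, relative_path in rules.items():
        path = os.path.join(root, relative_path)
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # executes the rules file
        if not callable(getattr(module, "run_rules", None)):
            raise AttributeError("%s does not define run_rules(data)" % path)
        loaded[name] = module
    return loaded
```

Each loaded module can then be invoked as `loaded["my_rules"].run_rules(data)`.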
redi.redi.main()[source]

Data processing steps:

  • parse raw XML to ElementTree: “data”

  • call read-in function to load xml into ElementTree

  • parse formEvents.xml to ElementTree

  • call read-in function to load xml into ElementTree

  • parse translationTable.xml to ElementTree

  • call read-in function to load xml into ElementTree

  • add element to data ElementTree for timestamp, redcap form name, eventName, formDateField, and formCompletedFieldName

  • write out ElementTree as an XML file

  • call read-in function to load xml into ElementTree

  • update timestamp using collection_date and collection_time

  • write redcapForm name to data ElementTree by a lookup of component ID in translationTable.xml

  • sort data by: study_id, form name, then timestamp, ascending order

  • write formDateField to data ElementTree via lookup of formName in formEvents.xml

  • write formCompletedFieldName to data ElementTree via lookup of formName in formEvents.xml

  • write eventName to data ElementTree via lookup of formName in formEvents.xml

Example:

<formName value="chemistry">
    <event name="1_arm_1" />
</formName>

  • write the Final ElementTree to EAV
redi.redi.parse_form_events(form_events_file)[source]

Parse the form_events file into an ElementTree

Parameters:form_events_file – the name of the input file (from the json configuration)
Returns:ElementTree
redi.redi.parse_raw_xml(raw_xml_file)[source]

Generate an ElementTree from a raw XML file.

Parameters:raw_xml_file – the input file.
Returns:parsed XML data
redi.redi.parse_translation_table(translation_table_file)[source]

Parse the translationTable.xml into an ElementTree

Parameters:translation_table_file – the name of the input file
Returns:ElementTree
redi.redi.read_config(config_file, configuration_directory, file_list)[source]

Check if files mentioned in configuration files exist

redi.redi.replace_fields_in_raw_xml(data, fields_to_replace_xml)[source]

Renames all fields that need renaming. The fields to rename are read from the xml file.

Parameters:
  • data – Raw data xml tree
  • fields_to_replace_xml – Path to xml file which has list of fields which need renaming.
redi.redi.research_id_to_redcap_id_converter(data, redcap_client, research_id_to_redcap_id, configuration_directory)[source]
This function converts the research_id to redcap_id:

  1. prepare a dictionary with [key, value] --> [study_id, redcap_id]
  2. replace the element tree study_id with the new redcap_ids; for each bad id, log it as a warning.

Example of xml fragment produced:

<subject lab_id="999-0001">
    <NAME>HEMOGLOBIN</NAME>
    <loinc_code>1534435</loinc_code>
    <RESULT>1234</RESULT>
    ...
    <STUDY_ID>1</STUDY_ID> <!-- originally this was "999-0001" -->
</subject>

Note: The next function which reads the “data” tree is #create_empty_event_tree_for_study()
redi.redi.run_preproc(preprocessors, settings)[source]
redi.redi.run_rules(rules, person_form_event_tree_with_data)[source]
redi.redi.setStat(event, translation_table_dict, translation_table_status_field_text_list)[source]

Ruchi Vivek Desai, May 13 2014 to assist the updateStatusFieldValueInPersonFormEventTree function

redi.redi.set_status_for(field_name, event, translation_table_dict)[source]

Ruchi

redi.redi.sort_element_tree(data, data_folder)[source]

Sort element tree based on three given indices. @see #update_time_stamp()

Keyword argument: data sorting is based on study_id, form name, then timestamp, ascending order
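
The sort order described above can be sketched with a key function and the standard-library ElementTree. The tag names (STUDY_ID, redcapFormName, timestamp) and helper names are assumptions for illustration; values are compared as strings, so the timestamp is assumed to be in a sortable format such as "2014-06-24 01:23:24".

```python
import xml.etree.ElementTree as ET


def key_timestamp(subject):
    # Triple (study_id, form_name, timestamp); missing elements sort first.
    return (
        subject.findtext("STUDY_ID") or "",
        subject.findtext("redcapFormName") or "",
        subject.findtext("timestamp") or "",
    )


def sort_subjects(root):
    # Slice assignment reorders the children of root in place, ascending.
    root[:] = sorted(root, key=key_timestamp)
```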

redi.redi.updateStatusFieldValueInPersonFormEventTree(person_form_event_tree, translational_table_tree)[source]

Ruchi Vivek Desai, May 13 2014. This function updates the status field value with either NOT_DONE (value in the translation table) or an empty string, based on certain conditions.

redi.redi.update_data_from_lookup(data, element_to_set_in_data, index_element_in_data, lookup_data, element_to_find_in_lookup_data, index_element_in_lookup_data, value_in_lookup_data, undefined)[source]
Update a single field in an element tree based on a lookup in another element tree
Parameters:
  • data – an element tree with a field that needs to be set
  • element_to_set_in_data – element that will be set
  • index_element_in_data – element in data that will be looked up in the lookup table, where the value of the element to be set will be found
  • lookup_data – an element tree that contains the lookup data
  • element_to_find_in_lookup_data – parameter for the initial findall in the lookup data
  • index_element_in_lookup_data – the element in the lookup data that will be the key in the lookup table
  • value_in_lookup_data – element in the lookup data that provides the value in the lookup table
  • undefined – a string to be returned for all failed lookups in the lookup table
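
The lookup described by these parameters can be sketched as a dictionary built from one tree and applied to another. The code below uses the standard-library ElementTree and operates on root elements; the function name and shortened parameter names are hypothetical, and creating the target element when it is missing is an assumption rather than documented behavior.

```python
import xml.etree.ElementTree as ET


def update_from_lookup(data, element_to_set, index_element, lookup_data,
                       element_to_find, index_in_lookup, value_in_lookup,
                       undefined):
    # Build the lookup table: key element text -> value element text.
    table = {}
    for entry in lookup_data.findall(element_to_find):
        table[entry.findtext(index_in_lookup)] = entry.findtext(value_in_lookup)
    # Set the target element on every child of the data root.
    for child in data:
        key = child.findtext(index_element)
        target = child.find(element_to_set)
        if target is None:
            target = ET.SubElement(child, element_to_set)
        target.text = table.get(key, undefined)  # failed lookups get `undefined`
```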
redi.redi.update_event_name(data, lookup_data, undefined)[source]

Update eventName in the data ElementTree via lookup of formName in the formEvents ElementTree

redi.redi.update_form_imported_field(data, lookup_data, undefined)[source]

Update the formImportedFieldName value for all subjects

redi.redi.update_formcompletedfieldname(data, lookup_data, undefined)[source]

Update formCompletedFieldName in the data ElementTree via lookup of formName in the formEvents ElementTree

redi.redi.update_formdatefield(data, form_events_tree)[source]

Write formDateField to data ElementTree via lookup of formName in form_events_tree ElementTree

redi.redi.update_recap_form_status(data, lookup_data, undefined)[source]

Update the redcapStatusFieldName value for all subjects

redi.redi.update_redcap_field_name_value_and_units(data, lookup_data, undefined)[source]

Update redcapFieldNameValue and redcapFieldNameUnits in the data ElementTree via lookup of redcapFieldNameValue and redcapFieldNameUnits in the translation table tree

redi.redi.update_redcap_form(data, lookup_data, undefined)[source]

Look up the component ID in translationTable to get the redcapFormName and write the redcapForm name to data. If the component lookup fails, sets formName to undefinedForm.

redi.redi.update_time_stamp(data, input_date_format, output_date_format)[source]

Update timestamp using input and output date formats. Warnings:

  • we modify the data ElementTree
  • we affect the sorting order of data elements @see #sort_element_tree()
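
Converting between an input and an output date format comes down to a strptime/strftime round trip. The sketch below assumes the timestamp lives in an element named timestamp and uses the standard-library ElementTree; the function name is hypothetical.

```python
from datetime import datetime
import xml.etree.ElementTree as ET


def reformat_timestamps(root, input_date_format, output_date_format):
    # Rewrites every <timestamp> in place, so the tree is modified.
    for node in root.iter("timestamp"):
        if node.text:
            parsed = datetime.strptime(node.text, input_date_format)
            node.text = parsed.strftime(output_date_format)
```

Because the text changes, any ordering that was based on the old string values may no longer hold, which is why sorting is expected to happen afterwards.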
redi.redi.validate_xml_file_and_extract_data(xmlfilename, xsdfilename)[source]

Validates an xml file against an xsd and, if validation succeeds, extracts the data from the xml

Parameters:
  • xmlfilename – This parameter holds the path to the xml file
  • xsdfilename – This parameter holds the path to the xsd file
redi.redi.verify_and_correct_collection_date(data, input_date_format)[source]
redi.redi.write_element_tree_to_file(element_tree, file_name)[source]

Write an ElementTree to a file whose name is provided as an argument

redi.redi_lib module

redi.report module

class redi.report.ReportCourier[source]

Bases: object

deliver(report)[source]
class redi.report.ReportCreator(report_file_path, project_name, redcap_uri, sort_by_lab_id, writer)[source]

Bases: object

create_report(report_data, alert_summary, collection_date_summary_dict, duration_dict)[source]
format_seconds_as_string(seconds)[source]

Convert seconds to a friendly string:

  3662 ==> ‘01:01:02’
  89662 ==> ‘1 day, 0:54:22’

Parameters:seconds (integer) – The number of seconds to be converted
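
A sketch that reproduces the two documented examples. The zero-padded form for durations under one day and the switch to timedelta formatting beyond that are assumptions chosen to match those examples; the actual method may be implemented differently.

```python
import time
from datetime import timedelta

SECONDS_PER_DAY = 24 * 60 * 60


def format_seconds(seconds):
    if seconds < SECONDS_PER_DAY:
        # Zero-padded HH:MM:SS for durations shorter than one day.
        return time.strftime("%H:%M:%S", time.gmtime(seconds))
    # timedelta renders longer durations as "N day(s), H:MM:SS".
    return str(timedelta(seconds=seconds))
```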
get_time_diff(end, start)[source]

Get the time difference in seconds between the two dates.

Parameters:
  • end (string) – The end timestamp
  • start (string) – The start timestamp
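
A minimal sketch of the computation, assuming both timestamps use the "2014-06-24 01:23:24" format documented in redi.batch. The function name and the `fmt` parameter are hypothetical.

```python
from datetime import datetime


def time_diff_seconds(end, start, fmt="%Y-%m-%d %H:%M:%S"):
    # Parse both timestamps and subtract; the result is negative if end < start.
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())
```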
class redi.report.ReportEmailSender(settings, logger)[source]

Bases: redi.report.ReportCourier

deliver(report)[source]

Deliver summary report as an email

Parameters:
  • email_settings – dictionary with email parameters
  • html – the actual report content

class redi.report.ReportFileWriter(output_file, logger)[source]

Bases: redi.report.ReportCourier

deliver(report)[source]

Deliver the summary report by writing it to a file or logging it to the console if writing the file fails

Parameters:
  • html_report_path – the path where the report will be stored
  • html – the actual report content

redi.report.gen_ele(ele_name, ele_text)[source]

Create an xml element with given name and content

redi.report.gen_subele(parent, subele_name, subele_text)[source]
redi.report.updateReportAlerts(root, alert_summary)[source]
redi.report.updateReportErrors(root, errors)[source]
redi.report.updateReportHeader(root, report_parameters)[source]

Update the passed root element tree with date, project name and url

redi.report.updateReportSummary(root, report_data)[source]
redi.report.updateSubjectDetails(root, subject_details)[source]

Helper method for #create_summary_report(). Adds subject information to the xml tree, which is later formatted by redi/utils/report.xsl into the html table#subject_details

redi.report.updateSummaryOfSpecimenTakenTimes(root, collection_date_summary_dict)[source]

redi.upload module

Functions related to uploading data to REDCap

redi.upload.create_import_data_json(import_data_dict, event_tree)[source]

Convert data from event_tree to json format.

@TODO: evaluate performance
@see the caller {@link #redi.upload.generate_output()}

Parameters:
  • import_data_dict – holds the event tree data
  • event_tree – holds the event tree data
Return type:dict
Returns:the json version of the xml data

redi.upload.create_redcap_records(import_data)[source]

Creates REDCap records from RED-I’s form data, AKA import data.

REDCap API only accepts records for importing. Records are differentiated by their unique record ID, unless the REDCap Project is a Longitudinal study. In that case, they are differentiated by a combination of record ID and an event.

Since RED-I views the world in terms of forms, we have to project our form-centric view into REDCap’s record-centric world. This is done by combining all form data with the same Subject ID and Event Name into the same record.

Parameters:import_data – iterable of 4-tuples: (study_id_key, form_name, event_name, json_data_dict)
Returns:iterable of REDCap records ready for upload
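
The form-to-record projection described above can be sketched as a grouping step: all form data sharing a subject ID and event name is merged into one dictionary. The function name is hypothetical, and merging with dict.update (later forms overwrite duplicate field names) is an assumption; the real function may resolve collisions differently.

```python
def combine_forms_into_records(import_data):
    records = {}
    for study_id_key, form_name, event_name, json_data_dict in import_data:
        # form_name is not part of the key: forms collapse into one record
        # per (subject, event) pair.
        record = records.setdefault((study_id_key, event_name), {})
        record.update(json_data_dict)
    return list(records.values())


forms = [
    ("1", "cbc", "1_arm_1", {"dm_subjid": "1", "wbc_lborres": "5.4"}),
    ("1", "chemistry", "1_arm_1", {"dm_subjid": "1", "k_lborres": "4.1"}),
    ("1", "cbc", "2_arm_1", {"dm_subjid": "1", "wbc_lborres": "6.0"}),
]
records = combine_forms_into_records(forms)
```

The field names in the sample data are invented for the demo.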
redi.upload.generate_output(person_tree, redcap_client, rate_limit, sent_events, max_retry_count, skip_blanks=False, bulk_send_blanks=False)[source]

Note: This function communicates with the redcap application. Steps:

  • loop for each person/form/event element
  • generate a csv fragment using create_eav_output
  • send csv fragment to REDCap using send_eav_data_to_redcap

@see the caller {@link #redi.redi._run()}

Return type:dictionary
Returns:the report_data which is passed to the report rendering function
redi.upload.handle_errors_in_redcap_xml_response(study_id, redcap_err, report_data)[source]

Checks for any errors in the redcap response and updates the report data if there are any.

Parameters:
  • redcap_err – RedcapError object
  • report_data – dictionary in which we store error details

Module contents