Data Concordance

Overview

Colectica allows you to describe how variables measure the same information at different times or among different populations.

Metadata Structure for Variables

Colectica uses three levels of items to describe how variables from different points in time or different datasets correspond to each other.

Item Type              Description
---------------------  ----------------------------------------------------------
Variable               A column in a dataset.
Represented Variable   Describes how a variable is measured, including its data
                       type. This may be consistent across rounds, or may change.
Conceptual Variable    Describes a measurement of a person, firm, or other thing,
                       without specifying the data type. The most generic way to
                       describe something that is measured.

Consider a dataset that measures marital status in three different years: 2000, 2005, and 2010. The dataset may look like:

ID  marstat2000  marstat2005  marstat2010
--  -----------  -----------  -----------
1   Married      Divorced     Married
2   Divorced     Divorced     Divorced
3   Married      Married      Widowed

In the first two years, the data is represented by two choices: Married and Divorced. In the third year, a new option is added, giving three choices: Married, Divorced, and Widowed. Two different representation types are therefore used for this variable over time.
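As a rough illustration of how the two representations can be spotted in the data, the following Python sketch groups the example columns by the set of values each one contains. The rows mirror the example dataset; the function name `group_columns_by_categories` is hypothetical and is not part of Colectica.

```python
from collections import defaultdict

# Rows copied from the example dataset.
rows = [
    {"ID": 1, "marstat2000": "Married", "marstat2005": "Divorced", "marstat2010": "Married"},
    {"ID": 2, "marstat2000": "Divorced", "marstat2005": "Divorced", "marstat2010": "Divorced"},
    {"ID": 3, "marstat2000": "Married", "marstat2005": "Married", "marstat2010": "Widowed"},
]


def group_columns_by_categories(rows, id_column="ID"):
    """Map each distinct set of observed values to the columns that use it."""
    # Collect the distinct values seen in every non-ID column.
    observed = defaultdict(set)
    for row in rows:
        for column, value in row.items():
            if column != id_column:
                observed[column].add(value)

    # Columns sharing the same value set share a representation.
    groups = defaultdict(list)
    for column, values in observed.items():
        groups[frozenset(values)].append(column)
    return dict(groups)


groups = group_columns_by_categories(rows)
for categories, columns in groups.items():
    print(sorted(categories), "->", columns)
```

Observed values are only a proxy here: a category that is defined but never selected would not show up, so the documented code list remains the authority on which representation a column uses.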

This dataset can be documented with three variables (aside from the ID): marstat2000, marstat2005, marstat2010. Since a variable corresponds to a single column in a single data file, three variables are necessary.

Since there are two representation types, we will use two represented variables; let’s call them marstat and marstat-plus.

Finally, a conceptual variable is used to describe the information common to all three variables. This can be named marstat, and should be referenced by both represented variables.

The following diagram visualizes these items and their relationships.

../../../_images/marstat-diagram.png
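The same relationships can be sketched in code. The Python classes below are a minimal, hypothetical model of the three item types, written only to show which item references which; they are not Colectica's own classes or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptualVariable:
    """The most generic description; no data type."""
    name: str


@dataclass(frozen=True)
class RepresentedVariable:
    """Adds the representation (here, a list of categories)."""
    name: str
    categories: tuple
    conceptual_variable: ConceptualVariable


@dataclass(frozen=True)
class Variable:
    """A single column in a single data file."""
    name: str
    represented_variable: RepresentedVariable


concept = ConceptualVariable("marstat")
marstat = RepresentedVariable("marstat", ("Married", "Divorced"), concept)
marstat_plus = RepresentedVariable(
    "marstat-plus", ("Married", "Divorced", "Widowed"), concept
)

variables = [
    Variable("marstat2000", marstat),
    Variable("marstat2005", marstat),
    Variable("marstat2010", marstat_plus),
]

# Following the references upward, every variable reaches the same concept.
shared = {v.represented_variable.conceptual_variable for v in variables}
print(len(shared), "conceptual variable shared by", len(variables), "variables")
```

Each variable points to exactly one represented variable, and each represented variable points to exactly one conceptual variable, which is what lets three columns with two representations be compared as one concept.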

Variable Concordance Views in Colectica Portal

By specifying the variables in this way, Colectica Portal is able to create concordance views that show a comparison of the variables.

../../../_images/portal-concordance-for-variables.png

For coded variables, Portal can also show a comparison of the codes used for each variable.

../../../_images/portal-concordance-code-comparison.png
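To give a sense of what such a code comparison contains, the following Python sketch builds a presence matrix for the marital status example. It is an illustration under assumptions, not Portal's implementation: the `code_lists` layout and the `compare_codes` function are hypothetical.

```python
# Code lists per variable. The labels come from the marital status example;
# the dictionary layout is an assumption made for this sketch.
code_lists = {
    "marstat2000": ["Married", "Divorced"],
    "marstat2005": ["Married", "Divorced"],
    "marstat2010": ["Married", "Divorced", "Widowed"],
}


def compare_codes(code_lists):
    """Return (all_codes, matrix); matrix[code][variable] is True if the variable uses the code."""
    # Collect every code once, keeping first-seen order.
    all_codes = []
    for codes in code_lists.values():
        for code in codes:
            if code not in all_codes:
                all_codes.append(code)

    matrix = {
        code: {variable: code in codes for variable, codes in code_lists.items()}
        for code in all_codes
    }
    return all_codes, matrix


all_codes, matrix = compare_codes(code_lists)
for code in all_codes:
    marks = ["x" if matrix[code][variable] else "-" for variable in code_lists]
    print(f"{code:<10}", *marks)
```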

Metadata Structure Used to Define Concordance

In Colectica Portal, the Explore page shows concordance tables that allow users to browse for variables by topic across many datasets. To enable this functionality, metadata must be created and published following the structure described here.

See also

For information on configuring the concordance tables in Colectica Portal, see /portal/technical/deployment/configuration.

Organizing by variable sets

In addition to the approach described above, concordance tables can also be built from an additional metadata structure. This allows the tables to be built from VariableGroups, without requiring PhysicalInstances to exist. The concorded items can be stored in VariableGroups in ResourcePackages that exist under either a Group or a StudyUnit.

  • Series or Study
    • Metadata Package (1..1)
      • Concept Set (1..1)
        • Concept (0..n)

      • Conceptual Variable Set (1..1)
        • Conceptual Variable Group (container group) (1..1)
          • Conceptual Variable Group (0..n)

      • Variable Set
        • Variable Group (0..n)
          • Variable (0..n)

    • Study (0..n)
      • Metadata Package (0..n)
        • Variable Scheme
          • Variable Group (0..n)
            • Variable (0..n)

The following logic will be used to build these concordance tables.

  • One table is built per ConceptualVariableGroup (same as the existing approach)

  • Each ConceptualVariable in the group gets a row (same as the existing approach)

  • All Variables that reference the ConceptualVariable are gathered

  • For each gathered Variable:
    • All VariableGroups and VariableSchemes that reference the Variable are gathered

  • One column is created for each distinct gathered VariableGroup/Scheme

  • Each ConceptualVariable (row) to VariableGroup (column) cell is filled with links to the Variables that reference the ConceptualVariable and are contained in the VariableGroup
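The steps above can be sketched in Python as follows. This is a simplified illustration, not Portal's code: items are reduced to plain names, the function `build_concordance_table` is hypothetical, and the sample data (the `age` concept, the `Wave ...` groups, and the `income2000` variable) is invented for the example.

```python
from collections import defaultdict


def build_concordance_table(conceptual_variables, variables, containers):
    """
    conceptual_variables: names in one ConceptualVariableGroup (one row each).
    variables: maps a variable name to the conceptual variable it references.
    containers: maps a VariableGroup/VariableScheme name to the variables it references.
    Returns (columns, table); table[row][column] is a list of variable names.
    """
    # Gather the variables that reference each conceptual variable.
    by_concept = defaultdict(list)
    for variable, concept in variables.items():
        if concept in conceptual_variables:
            by_concept[concept].append(variable)

    # One column per distinct container that references a gathered variable.
    gathered = {v for names in by_concept.values() for v in names}
    columns = [
        name for name, members in containers.items()
        if any(member in gathered for member in members)
    ]

    # Fill each row/column cell with the matching variables.
    table = {
        concept: {
            column: [v for v in by_concept.get(concept, []) if v in containers[column]]
            for column in columns
        }
        for concept in conceptual_variables
    }
    return columns, table


# Hypothetical sample data.
conceptual_variables = ["marstat", "age"]
variables = {
    "marstat2000": "marstat",
    "marstat2005": "marstat",
    "marstat2010": "marstat",
    "age2000": "age",
}
containers = {
    "Wave 2000": ["marstat2000", "age2000"],
    "Wave 2005": ["marstat2005"],
    "Wave 2010": ["marstat2010"],
    "Unrelated": ["income2000"],
}

columns, table = build_concordance_table(conceptual_variables, variables, containers)
for concept in conceptual_variables:
    print(concept, [table[concept][column] for column in columns])
```

A container that references none of the gathered variables, such as `Unrelated` here, produces no column, and a cell stays empty when a group holds no variable for that row's conceptual variable.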

Concordance Tables in Colectica Portal

With metadata in the structure described above, Colectica Portal will display concordance tables similar to the following:

../../../_images/concordance-table.png

Note

Statistical comparisons are only available when using the PhysicalInstance approach.

Describe Concordance in Colectica Designer

To assign a represented variable to a variable:

  1. Navigate to the variable editor for the variable.

    ../../../_images/variable-editor.png
  2. On the Concept tab, search for a represented variable to assign, or create a new one.

    ../../../_images/variable-concept-tab.png

    See also

    For instructions on assigning or creating the represented variable, see Reference a Single Item.

  3. Assign the same represented variable to any other variables, as appropriate.

  4. Drill into the represented variable.

  5. Using the represented variable’s editor, assign or create a conceptual variable.

  6. After assigning represented and conceptual variables, you can view the list of all referenced variables by using the Links view from the represented variable.

    ../../../_images/represented-variable-links.png