Metadata-Version: 2.4
Name: autorml
Version: 0.1.0
Summary: AutoRML: A framework for automatic RML mapping generation using semantic table annotations
Home-page: https://github.com/dtai-kg/AutoRML
Author: Ioannis Dasoulas
Author-email: ioannis.dasoulas@kuleuven.be
License: Apache-2.0
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Natural Language :: English
Requires-Python: >=3.9, <3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: yatter==1.1.4
Requires-Dist: morph-kgc==2.8.0
Requires-Dist: requests==2.32.2
Requires-Dist: pandas==2.2.1
Requires-Dist: pyyaml==6.0.2
Requires-Dist: torchic_tab_heuristic==0.1.0
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# AutoRML

[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0) 
[![Python Versions](https://img.shields.io/badge/Python-3.9%20|%203.10%20|%203.11-blue.svg)](https://www.python.org/)

**AutoRML** is a system that automatically generates declarative mappings ([YARRRML](https://rml.io/yarrrml/) and [RML](https://rml.io)) and constructs RDF knowledge graphs from tabular data. It relies on third-party semantic table annotation systems and translates the semantic annotations they produce into declarative mapping rules.

AutoRML can be used with different semantic table annotation systems. It has been tested with [MTab](https://mtab.kgraph.jp) and [TorchicTab](https://ceur-ws.org/Vol-3557/paper2.pdf). Update: MTab is no longer available, so only TorchicTab is currently supported.

![System Overview](resources/system.jpg)



## Installation 

Python versions 3.9, 3.10, and 3.11 are tested and recommended. In case of conflicts, create a new virtual environment. For example, if you use conda, run:

```bash
conda create -n autorml_env python=3.11
conda activate autorml_env
```

AutoRML installation:

```bash
pip install autorml
```


## Usage

```
Usage: 
python -m autorml <options>

Options:
-h, --help                          Show options information
-i, --input_table                   Input table in CSV format
-sta, --sta_system                  Used semantic table annotation system (default: torchictab)
-m, --materialize                   Enable RDF materialization 
-sa, --save_annotations             Save semantic table annotation system results
-af, --annotations_folder           Output annotations folder (default: annotations)
-oa, --annotations_output           Output annotations file (default: annotations.json)
-mf, --mappings_folder              Output mappings folder (default: mappings)
-oy, --yarrml_output                Output YARRRML mappings file (default: mappings.yml)
-or, --rml_output                   Output RML mappings (default: mappings.rml.ttl)
-rf, --rdf_folder                   Output RDF folder (default: rdf)
-okg, --rdf_output                  Output RDF file (default: kg.nt)
-ds, --delete_sem                   Delete supporting semantically enhanced table after termination
```
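
The options above can also be driven from a script when you have several tables to process. The following is a minimal sketch, not part of AutoRML itself: the helper names `build_command` and `run_batch` are hypothetical, and the choice of one output folder per table is an assumption made here so that the default file names (`annotations.json`, `mappings.yml`, `kg.nt`) do not overwrite each other. It only uses flags listed in the usage text.

```python
import subprocess
import sys
from pathlib import Path


def build_command(table, out_root):
    """Build the AutoRML command line for one CSV table.

    Hypothetical helper: only options from the usage text are used, and
    each table gets its own output folders (an assumption of this sketch).
    """
    table = Path(table)
    out_dir = Path(out_root) / table.stem
    return [
        sys.executable, "-m", "autorml",
        "-i", str(table),
        "-m",                                  # enable RDF materialization
        "-sa",                                 # save the semantic annotations
        "-af", str(out_dir / "annotations"),
        "-mf", str(out_dir / "mappings"),
        "-rf", str(out_dir / "rdf"),
    ]


def run_batch(tables_dir, out_root):
    """Run AutoRML once per CSV file found directly inside tables_dir."""
    for table in sorted(Path(tables_dir).glob("*.csv")):
        out_dir = Path(out_root) / table.stem
        # Create the output folders up front; whether AutoRML creates
        # nested folders on its own is not assumed here.
        for name in ("annotations", "mappings", "rdf"):
            (out_dir / name).mkdir(parents=True, exist_ok=True)
        subprocess.run(build_command(table, out_root), check=True)
```

Calling `run_batch("examples/tables", "output")` would then process every CSV file in `examples/tables` and stop at the first run that exits with an error.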



## Example usage 

You can find some tables to test AutoRML in the `examples` folder. To run AutoRML for one of the examples, go to the repository root and run:

```bash
python -m autorml -i "examples/tables/cities.csv" -m -sa -af "examples/annotations" -mf "examples/mappings" -rf "examples/rdf" -ds
```

After AutoRML terminates, explore the `examples` folder to see the results of the automated declarative knowledge graph construction. The `examples/annotations` folder contains the semantic annotations generated by the semantic table annotation system. The `examples/mappings` folder contains the human-readable YARRRML mappings and the RML mappings produced by AutoRML. The `examples/rdf` folder contains the RDF knowledge graph generated from the input table using AutoRML's RML mappings. You can explore and query the generated graph, and look up its entities and relationships in Wikidata, the knowledge base that TorchicTab uses as its annotation target.
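
As a quick first look at the generated graph, the sketch below counts the triples and the most frequent predicates in the N-Triples output using only the Python standard library. It assumes the file is `examples/rdf/kg.nt`, that is, the `-rf` folder from the command above combined with the default `-okg` name. The function names are hypothetical, and the whitespace split is a simplification rather than a full N-Triples parser; for real querying, an RDF library is the better tool.

```python
from collections import Counter


def predicate_counts(lines):
    """Count how often each predicate occurs in N-Triples lines.

    Simplified parsing: subject and predicate never contain whitespace
    in N-Triples, so the second whitespace-separated token is the predicate.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 2)
        if len(parts) == 3:
            counts[parts[1]] += 1
    return counts


def summarize(path="examples/rdf/kg.nt"):
    """Print the triple count and the ten most frequent predicates.

    The default path is an assumption based on the example command.
    """
    with open(path, encoding="utf-8") as handle:
        counts = predicate_counts(handle)
    print(f"{sum(counts.values())} triples, {len(counts)} distinct predicates")
    for predicate, n in counts.most_common(10):
        print(f"{n:6d}  {predicate}")
```

Running `summarize()` from the repository root after the example command prints the summary; the predicates it lists can then be looked up in Wikidata.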

The open-source version of AutoRML currently supports only TorchicTab as a semantic table annotation system, since MTab is no longer available. AutoRML is designed to work with different semantic table annotation systems, so you can also host your own or contact the AutoRML authors to test your tables.



## Evaluations

To see AutoRML results for different dataset collections and semantic table annotation systems, unzip and explore `evaluations/evaluations_data.zip` or visit the dedicated AutoRML Evaluations Zenodo repository (the repository exists but is currently private; the link will be added when the paper is submitted). The evaluation files contain semantic annotations, mappings and RDF knowledge graphs automatically generated by AutoRML using two semantic table annotation systems, [MTab](https://mtab.kgraph.jp) and [TorchicTab](https://ceur-ws.org/Vol-3557/paper2.pdf), on five different collections of CSV tables.

Annotation accuracy:
![Annotation accuracy](resources/accuracy.png)

Annotation automation:
![Annotation automation](resources/automation.png)



## Contact

Ioannis Dasoulas: ioannis.dasoulas@kuleuven.be 

Anastasia Dimou: anastasia.dimou@kuleuven.be
