Metadata-Version: 2.1
Name: pycode2seq
Version: 0.0.3
Summary: Inference and training for multiple languages of code2seq
Home-page: https://github.com/kisate/pycode2seq
Author: Dmitrii Kharlapenko
Author-email: dimkakha@gmail.com
License: MIT
Download-URL: https://pypi.org/project/pycode2seq/
Keywords: code2seq,pytorch,pytorch-lightning,ml4code,ml4se
Platform: UNKNOWN
Description-Content-Type: text/markdown
Requires-Dist: torch (~=1.8.1)
Requires-Dist: torchtext (>=0.10.0)
Requires-Dist: pytorch-lightning (>=1.3.5)
Requires-Dist: code2seq (==0.0.2)
Requires-Dist: antlr4-python3-runtime (~=4.9.2)
Requires-Dist: setuptools (>=52.0.0)
Requires-Dist: tqdm (~=4.32.2)
Requires-Dist: numpy (>=1.20.1)
Requires-Dist: regex (>=2019.11.1)
Requires-Dist: omegaconf (~=2.0.6)
Requires-Dist: dataclasses (~=0.6)
Requires-Dist: requests (~=2.25.1)

# pycode2seq

Training and inference for multiple programming languages, based on the PyTorch implementation of the code2seq model.

## Installation

```shell
pip install pycode2seq
```

## Inference

#### File embeddings example

```python
from pycode2seq import Code2Seq

model = Code2Seq.load("kt_java")

# Dictionary of method names with their embeddings
method_embeddings = model.methods_embeddings("File.kt", "kt")
```
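
One possible use of the embeddings dictionary is finding methods that resemble each other. The sketch below is illustrative only: it assumes each embedding can be converted to a flat list of floats (the exact value type returned by `methods_embeddings` is not stated here), and it uses hypothetical stand-in data instead of a loaded model. The helper names `cosine_similarity` and `most_similar` are not part of pycode2seq.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(name, embeddings):
    """Return (other_name, score) for the method closest to `name`.

    `embeddings` maps method names to vectors and must contain
    at least one method other than `name`.
    """
    query = embeddings[name]
    scores = {
        other: cosine_similarity(query, vector)
        for other, vector in embeddings.items()
        if other != name
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical stand-in data; real vectors would come from the model
# and would be much longer.
example_embeddings = {
    "readFile": [1.0, 0.0, 1.0],
    "loadFile": [0.9, 0.1, 1.0],
    "sortList": [0.0, 1.0, 0.0],
}

closest_name, closest_score = most_similar("readFile", example_embeddings)
```

With real output, the same helper would be called on the dictionary returned by the model after converting each value to a list of floats, if such a conversion is needed.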

#### Full functionality

```python
import sys
from pycode2seq import Code2Seq

def main(argv):
    model = Code2Seq.load("kt_java")

    # Dictionary of method names with their embeddings
    method_embeddings = model.methods_embeddings("File.kt", "kt")

    # code2seq predictions
    predictions = model.run_on_file(argv[1], "kt")

    # Predicted method names
    names = [model.prediction_to_text(prediction) for prediction in predictions]

if __name__ == "__main__":
    main(sys.argv)
```

## Training

Download astminer and run:

```shell
./gradlew shadowJar
```

Mine projects for paths:

```shell
python training/mine_projects.py <data folder> <output folder> <path to astminer's cli.sh>
```

Combine mined paths:

```shell
python training/astminer_to_code2seq.py <data folder/holdout> <output folder> <holdout>
```
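
The two steps above can be scripted when several holdouts need to be processed. The following is a minimal sketch, not part of the repository: the function name `build_mining_commands`, the holdout names `train`, `val` and `test`, and all folder names are assumptions. It only assembles the argument lists in the order shown above and does not run anything.

```python
from pathlib import Path


def build_mining_commands(data_folder, mined_folder, astminer_cli,
                          paths_folder, output_folder, holdouts):
    """Build argument lists for the mining and combining steps.

    The first command mines projects; one further command per holdout
    combines the mined paths. Which folder feeds the second step
    (`paths_folder`) is left to the caller, since it depends on how
    the data is laid out.
    """
    commands = [[
        "python", "training/mine_projects.py",
        str(data_folder), str(mined_folder), str(astminer_cli),
    ]]
    for holdout in holdouts:
        commands.append([
            "python", "training/astminer_to_code2seq.py",
            str(Path(paths_folder) / holdout), str(output_folder), holdout,
        ])
    return commands


# Hypothetical folder and holdout names, for illustration only.
commands = build_mining_commands(
    data_folder="data",
    mined_folder="mined",
    astminer_cli="astminer/cli.sh",
    paths_folder="mined",
    output_folder="dataset",
    holdouts=["train", "val", "test"],
)
```

Each entry in `commands` can then be passed to `subprocess.run(cmd, check=True)` from the repository root so that a failing step stops the pipeline.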

Build the vocabulary with `build_vocabulary.py` from the code2seq module.

Combine vocabularies:

```shell
python training/combine_vocabularies.py
```

Expand weights:

```shell
python training/expand_weights.py
```


