Metadata-Version: 2.4
Name: abcupload
Version: 1.1.0
Summary: ABCOMICS Genomic Data Upload Tool — upload sequencing files to ABCOMICS
Author-email: Khadim Gueye <khadim.gueye@abcomics.org>
License: MIT
Project-URL: Homepage, https://abcomics.org
Keywords: bioinformatics,genomics,upload,ABCOMICS,fastq,bam,vcf
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Utilities
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests>=2.28
Requires-Dist: rich>=13.0
Dynamic: license-file

# abcupload

**ABCOMICS Genomic Data Upload Tool**

A command-line tool to upload sequencing and genomic data files to the
[ABCOMICS](https://abcomics.org) platform. It handles the full upload process
automatically: requesting a signed URL from the API, then streaming each file
directly to secure cloud storage with a live progress bar.
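The streaming step can be illustrated with a small generator that reads a file in fixed-size chunks and reports progress as it goes. This is only a sketch of the general technique, not the tool's actual code: `iter_chunks`, `CHUNK_SIZE`, and the `on_progress` callback are hypothetical names, and the signed-URL request is left out because the API endpoint is not described here.

```python
import io
from typing import BinaryIO, Callable, Iterator

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB per read; illustrative value, not the tool's setting


def iter_chunks(
    stream: BinaryIO,
    total_bytes: int,
    on_progress: Callable[[int, int], None],
    chunk_size: int = CHUNK_SIZE,
) -> Iterator[bytes]:
    """Yield the stream in fixed-size chunks, reporting bytes read so far."""
    sent = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        sent += len(chunk)
        on_progress(sent, total_bytes)  # e.g. update a progress bar
        yield chunk


# Demo on an in-memory payload instead of a real sequencing file.
payload = b"ACGT" * 1000  # 4000 bytes
progress_log = []
chunks = list(
    iter_chunks(
        io.BytesIO(payload),
        len(payload),
        lambda sent, total: progress_log.append((sent, total)),
        chunk_size=1024,
    )
)
```

A generator like this can be passed as the `data=` argument of `requests.put`, which then sends the body without loading the whole file into memory. Whether `abcupload` streams this way internally is an assumption.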

---

## Installation

```bash
pip install abcupload
```

Requires Python 3.8 or later. Dependencies (`requests`, `rich`) are installed
automatically.

---

## Quick start — manifest mode (recommended)

After submitting the run and analysis metadata for your project in the ABCOMICS
dashboard, download the project's **upload manifest** (a TSV file). The manifest
maps each local filename to its ID-based destination name in ABCOMICS storage.

```bash
# Set your API key once (contact contact@abcomics.org to receive it)
export ABCOMICS_API_KEY=your_key_here

# Upload all files listed in the manifest
abcupload -d /path/to/your/files/ -m PRJAB00001_upload_manifest.tsv

# Overwrite without asking if a file already exists
abcupload -d /path/to/your/files/ -m PRJAB00001_upload_manifest.tsv -o
```

The manifest is a tab-separated file with four columns:

```
# entity_id    original_filename       new_gcs_filename            project_id
RUNAB00001      sample_R1.fastq.gz      RUNAB00001_1.fastq.gz       PRJAB00001
RUNAB00001      sample_R2.fastq.gz      RUNAB00001_2.fastq.gz       PRJAB00001
RUNAB00002      reads.bam               RUNAB00002.bam              PRJAB00001
ANZAB00001      result.tsv              ANZAB00001.tsv              PRJAB00001
```

`abcupload` uses the manifest to store each file under its ID-based name (the
`new_gcs_filename` column) during upload. **Your local files are never renamed.**
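To make the manifest layout concrete, here is a minimal sketch of how such a file could be parsed in Python. The `ManifestRow` type and `read_manifest` function are hypothetical names introduced for illustration. The sketch assumes that lines starting with `#` are header or comment lines and that every data line has exactly four tab-separated fields.

```python
import csv
import io
from typing import Dict, NamedTuple


class ManifestRow(NamedTuple):
    entity_id: str
    original_filename: str
    new_gcs_filename: str
    project_id: str


def read_manifest(handle) -> Dict[str, ManifestRow]:
    """Map each local filename to its manifest row, skipping '#' lines."""
    rows = {}
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields or fields[0].startswith("#"):
            continue  # blank line or header/comment
        if len(fields) != 4:
            raise ValueError(f"expected 4 tab-separated columns, got {len(fields)}")
        row = ManifestRow(*(f.strip() for f in fields))
        rows[row.original_filename] = row
    return rows


# Demo with two rows from the example manifest above.
sample = io.StringIO(
    "# entity_id\toriginal_filename\tnew_gcs_filename\tproject_id\n"
    "RUNAB00001\tsample_R1.fastq.gz\tRUNAB00001_1.fastq.gz\tPRJAB00001\n"
    "RUNAB00002\treads.bam\tRUNAB00002.bam\tPRJAB00001\n"
)
manifest = read_manifest(sample)
```

Keying the result by `original_filename` makes it easy to look up the destination name for each file found in the upload directory.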

---

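## Legacy mode

Legacy mode takes the username and project ID on the command line instead of
reading them from a manifest.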

```bash
# Upload a single file
abcupload -u abc-000001 -p PRJAB00001 -d sample_R1.fastq.gz

# Upload all .fastq.gz files in the current directory (the pattern is quoted so the shell does not expand it)
abcupload -u abc-000001 -p PRJAB00001 -d '*.fastq.gz'

# Upload an entire directory (scans for all accepted formats)
abcupload -u abc-000001 -p PRJAB00001 -d /data/project/

# Test mode — files are deleted after 24 hours
abcupload -u abc-000001 -p PRJAB00001 -d '*.fastq.gz' -t
```

---

## Full usage

```
manifest mode (recommended):
  abcupload -d DIRECTORY -m MANIFEST_TSV [options]

  -d DIRECTORY      Directory containing the files to upload
  -m MANIFEST_TSV   TSV manifest downloaded from the ABCOMICS dashboard

legacy mode:
  abcupload -u USERNAME -p PROJECT_ID -d FILES [options]

  -u USERNAME       ABCOMICS username          (e.g. abc-000001)
  -p PROJECT_ID     Project ID                 (e.g. PRJAB00001)
  -d FILES          Single file, comma list, glob pattern, or directory

options:
  -k API_KEY        API key  (alternative to ABCOMICS_API_KEY env var)
  -t                Test mode — files deleted after 24 h
  -o                Overwrite existing files without asking
  -h, --help        Show this help and exit
  -V, --version     Show version and exit
```
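In legacy mode, `-d` accepts a single file, a comma-separated list, a glob pattern, or a directory. The sketch below shows one way such a value could be expanded into concrete paths. `resolve_files` is a hypothetical helper, not the tool's implementation; it does not filter by accepted format and does not descend into subdirectories, both of which are simplifying assumptions.

```python
import glob
import os
import tempfile
from typing import List


def resolve_files(spec: str) -> List[str]:
    """Expand a -d style value into a sorted list of existing file paths."""
    found = set()
    for part in (p.strip() for p in spec.split(",")):
        if not part:
            continue
        if os.path.isdir(part):
            # Directory: take every regular file directly inside it.
            for name in os.listdir(part):
                path = os.path.join(part, name)
                if os.path.isfile(path):
                    found.add(path)
        else:
            # glob.glob returns the path itself for a plain existing file
            # and an empty list when nothing matches.
            found.update(p for p in glob.glob(part) if os.path.isfile(p))
    return sorted(found)


# Demo in a throwaway directory with three empty files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a_R1.fastq.gz", "a_R2.fastq.gz", "reads.bam"):
        open(os.path.join(tmp, name), "wb").close()
    by_glob = [os.path.basename(p) for p in resolve_files(os.path.join(tmp, "*.fastq.gz"))]
    by_dir = [os.path.basename(p) for p in resolve_files(tmp)]
    by_list = [
        os.path.basename(p)
        for p in resolve_files(
            os.path.join(tmp, "reads.bam") + "," + os.path.join(tmp, "a_R1.fastq.gz")
        )
    ]
```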

---

## Environment variable

```bash
# Add to ~/.bashrc or ~/.zshrc for persistence
export ABCOMICS_API_KEY=your_key_here
```
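The key can come from either `-k` or the `ABCOMICS_API_KEY` environment variable. A minimal sketch of that lookup follows. `resolve_api_key` is a hypothetical helper, and the precedence shown (the flag wins over the environment variable) is an assumption, not documented behaviour.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(cli_value: Optional[str], env: Mapping[str, str] = os.environ) -> str:
    """Return the key from -k if given, else from ABCOMICS_API_KEY."""
    # Assumed precedence: an explicit -k value overrides the environment.
    key = (cli_value or env.get("ABCOMICS_API_KEY", "")).strip()
    if not key:
        raise SystemExit("No API key: pass -k or set ABCOMICS_API_KEY")
    return key


# Demo with an explicit mapping instead of the real environment.
from_flag = resolve_api_key("flag_key", env={"ABCOMICS_API_KEY": "env_key"})
from_env = resolve_api_key(None, env={"ABCOMICS_API_KEY": "env_key"})
```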

---

## Accepted file formats

| Format           | Extensions |
|------------------|------------|
| FASTQ            | `.fastq`, `.fq`, `.fastq.gz`, `.fq.gz`, `.fastq.bz2`, `.fq.bz2` |
| FASTA            | `.fasta`, `.fa`, `.fna`, `.faa`, `.ffn`, `.frn`, `.fasta.gz`, `.fa.gz`, `.fna.gz`, `.faa.gz` |
| BAM / SAM / CRAM | `.bam`, `.sam`, `.cram` |
| Index files      | `.bai`, `.crai`, `.csi`, `.tbi` |
| VCF / BCF        | `.vcf`, `.vcf.gz`, `.bcf`, `.bcf.gz` |
| Annotation       | `.gff`, `.gff3`, `.gtf`, `.bed` (and `.gz` variants) |
| Tabular          | `.csv`, `.tsv` |

Maximum file size: **500 GB**.
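To check files before starting a long upload, the table and size limit above can be turned into a simple local pre-check. This is a sketch under stated assumptions: `check_file` and `ACCEPTED_SUFFIXES` are hypothetical names, matching is case-insensitive, and 500 GB is taken as 500 × 10⁹ bytes. The tool's own validation may differ.

```python
ACCEPTED_SUFFIXES = (
    ".fastq", ".fq", ".fastq.gz", ".fq.gz", ".fastq.bz2", ".fq.bz2",
    ".fasta", ".fa", ".fna", ".faa", ".ffn", ".frn",
    ".fasta.gz", ".fa.gz", ".fna.gz", ".faa.gz",
    ".bam", ".sam", ".cram",
    ".bai", ".crai", ".csi", ".tbi",
    ".vcf", ".vcf.gz", ".bcf", ".bcf.gz",
    ".gff", ".gff3", ".gtf", ".bed",
    ".gff.gz", ".gff3.gz", ".gtf.gz", ".bed.gz",
    ".csv", ".tsv",
)
MAX_BYTES = 500 * 10**9  # assumes decimal gigabytes


def check_file(name: str, size_bytes: int) -> str:
    """Return 'ok' or a short reason why the file would be rejected."""
    # str.endswith accepts a tuple and matches any of its entries.
    if not name.lower().endswith(ACCEPTED_SUFFIXES):
        return "unsupported format"
    if size_bytes > MAX_BYTES:
        return "too large"
    return "ok"


results = {
    "sample_R1.fastq.gz": check_file("sample_R1.fastq.gz", 1024),
    "notes.txt": check_file("notes.txt", 10),
    "reads.bam": check_file("reads.bam", MAX_BYTES + 1),
}
```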

---

## Changelog

### 1.1.0
- **New manifest mode** (`-d DIR -m manifest.tsv`): reads an ABCOMICS upload manifest to rename files to their ID-based GCS names on upload without touching local files.
- Files now upload directly to `public/{project_id}/` in GCS; no private-to-public move step is needed.
- `-u` (username) and `-p` (project_id) are no longer required in manifest mode; all information comes from the TSV.
- Legacy mode (`-u USER -p PRJ -d files`) is still fully supported.

### 1.0.1
- Initial public release.

---

## Author

**Khadim Gueye** — African Bioinformatics Center (ABCOMICS)  
Contact: [contact@abcomics.org](mailto:contact@abcomics.org)

---

## License

[MIT](LICENSE)
