changeset 0:5d4c00d5e84e draft

planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/main/data_managers/data_manager_build_coreprofiler commit ed6d188035ac31aa2da132889141b0898aa6bc86
author iuc
date Fri, 21 Nov 2025 13:11:30 +0000
parents
children c1251aa70a03
files README.rst data_manager/data_manager_build_coreprofiler_download.xml data_manager_conf.xml test-data/coreprofiler_scheme.loc.test tool-data/coreprofiler_scheme.loc.sample tool_data_table_conf.xml.sample tool_data_table_conf.xml.test
diffstat 7 files changed, 447 insertions(+), 0 deletions(-)
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/README.rst	Fri Nov 21 13:11:30 2025 +0000
@@ -0,0 +1,90 @@
+This tool downloads and builds the **CoreProfiler** scheme.
+-------------------------------------------------------------------------------
+
+You can find the list of available schemes, as well as the reference platforms supported by CoreProfiler, in the
+`CoreProfiler documentation <https://gitlab.com/ifb-elixirfr/abromics/coreprofiler/-/blob/main/README.md?ref_type=heads#basic-usage>`_.
+
+Please refer to that page for details on how to use the tool and which schemes are available.
+
+Use Galaxy's data manager framework to download and install new CoreProfiler schemes.
+
+If you want to use a scheme from **EnteroBase**, you do not need to provide any token or secret.
+
+However, if you want to use a scheme from **PubMLST** or **BIGSdb**, you need to complete the procedure below before launching the data manager.
+
+BIGSdb and PubMLST require **OAuth1 authentication** to download the most up-to-date schemes.
+Authentication is not strictly mandatory, but without it the download may return an outdated scheme.
+
+This authentication involves two types of tokens:
+
+* **Consumer tokens**: permanent tokens used to initiate the authentication flow.
+* **Access tokens**: tokens required to download a scheme.
+
+Procedure for **PubMLST schemes** (example: ``borrelia_3-cgMLST-639-pubmlst``)
+-------------------------------------------------------------------------------
+
+1. Create an account on the `PubMLST website <https://pubmlst.org/bigsdb>`_.
+2. Generate a consumer token and secret from your account settings
+   (**My account → API keys → Enter key name → Submit**).
+3. On your account page, go to **Database registrations**, check all databases, and register.
+4. Download `coreprofiler <https://gitlab.com/ifb-elixirfr/abromics/coreprofiler>`_ locally and run the following command to obtain your access token and secret:
+
+   .. code-block:: bash
+
+      coreprofiler db get_request_tokens --scheme <SCHEME_NAME> \
+         --consumer_key <YOUR_CONSUMER_TOKEN> \
+         --consumer_secret <YOUR_CONSUMER_SECRET>
+
+   Replace the placeholders with your scheme of interest (example: ``borrelia_3``) and your actual consumer token and secret.
+
+   This command prints a URL to visit in order to authorize the client software to access your account.
+   After you authorize it, you receive a verification code; enter it at the command-line prompt.
+   The command then returns your access token and secret.
+
+5. Make the consumer token, consumer secret, access token, and access secret available to the data manager tool
+   by exporting them as bash variables in a ``.txt`` file:
+
+   .. code-block:: bash
+
+      export COREPROFILER_CONSUMER_TOKEN="<YOUR_CONSUMER_TOKEN>"
+      export COREPROFILER_CONSUMER_SECRET="<YOUR_CONSUMER_SECRET>"
+      export COREPROFILER_ACCESS_TOKEN="<YOUR_ACCESS_TOKEN>"
+      export COREPROFILER_ACCESS_SECRET="<YOUR_ACCESS_SECRET>"
+
+6. Point the ``COREPROFILER_SECRETS_PATH`` environment variable to this ``.txt`` file
+   (example: ``export COREPROFILER_SECRETS_PATH="/path/to/your/secret_file.txt"``).
+
+
+Procedure for **BIGSdb schemes** (example: ``bordetella_1-cgMLST_genus-1415-BIGSdb``)
+--------------------------------------------------------------------------------------
+
+1. Create an account on the `BIGSdb website <https://bigsdb.pasteur.fr/cgi-bin/bigsdb/bigsdb.pl?page=registration>`_.
+2. Request a consumer token and secret by sending an email to ``bigsdb@pasteur.fr``
+   (subject: **API client key**).
+3. On your account page, go to **Database registrations**, check all databases, and register.
+4. Download `coreprofiler <https://gitlab.com/ifb-elixirfr/abromics/coreprofiler>`_ locally and run the following command to obtain your access token and secret:
+
+   .. code-block:: bash
+
+      coreprofiler db get_request_tokens --scheme <SCHEME_NAME> \
+         --consumer_key <YOUR_CONSUMER_TOKEN> \
+         --consumer_secret <YOUR_CONSUMER_SECRET>
+
+   Replace the placeholders with your scheme of interest (example: ``bordetella_1``) and your actual consumer token and secret.
+
+   This command prints a URL to visit in order to authorize the client software to access your account.
+   After you authorize it, you receive a verification code; enter it at the command-line prompt.
+   The command then returns your access token and secret.
+
+5. Make the consumer token, consumer secret, access token, and access secret available to the data manager tool
+   by exporting them as bash variables in a ``.txt`` file:
+
+   .. code-block:: bash
+
+      export COREPROFILER_CONSUMER_TOKEN="<YOUR_CONSUMER_TOKEN>"
+      export COREPROFILER_CONSUMER_SECRET="<YOUR_CONSUMER_SECRET>"
+      export COREPROFILER_ACCESS_TOKEN="<YOUR_ACCESS_TOKEN>"
+      export COREPROFILER_ACCESS_SECRET="<YOUR_ACCESS_SECRET>"
+
+6. Point the ``COREPROFILER_SECRETS_PATH`` environment variable to this ``.txt`` file
+   (example: ``export COREPROFILER_SECRETS_PATH="/path/to/your/secret_file.txt"``).
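The secrets file described in step 5 of the README can be sanity-checked before the data manager is launched. The following is a minimal Python sketch, assuming the file contains only simple `export NAME="value"` lines as shown in the README. `parse_secrets` and `missing_secrets` are hypothetical helpers written for illustration; they are not part of CoreProfiler or Galaxy.

```python
import re

# The four variable names are taken from the README above.
REQUIRED = (
    "COREPROFILER_CONSUMER_TOKEN",
    "COREPROFILER_CONSUMER_SECRET",
    "COREPROFILER_ACCESS_TOKEN",
    "COREPROFILER_ACCESS_SECRET",
)

# Matches simple lines of the form: export NAME="value"
EXPORT_RE = re.compile(r'^\s*export\s+([A-Za-z_][A-Za-z0-9_]*)="(.*)"\s*$')


def parse_secrets(text):
    """Return a dict NAME -> value for every simple export line in text."""
    secrets = {}
    for line in text.splitlines():
        match = EXPORT_RE.match(line)
        if match:
            secrets[match.group(1)] = match.group(2)
    return secrets


def missing_secrets(text):
    """Return the required variable names that are absent or empty."""
    secrets = parse_secrets(text)
    return [name for name in REQUIRED if not secrets.get(name)]


# Hypothetical file content: one value is empty and one line is missing.
example = '''
export COREPROFILER_CONSUMER_TOKEN="abc"
export COREPROFILER_CONSUMER_SECRET="def"
export COREPROFILER_ACCESS_TOKEN=""
'''
print(missing_secrets(example))
```

An empty list means all four variables are defined and non-empty. The sketch does not handle shell features such as variable expansion or single-quoted values.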
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/data_manager/data_manager_build_coreprofiler_download.xml	Fri Nov 21 13:11:30 2025 +0000
@@ -0,0 +1,296 @@
+<tool id="data_manager_build_coreprofiler" name="Download and build CoreProfiler scheme" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" tool_type="manage_data" profile="@PROFILE@">
+    <description></description>
+    <macros>
+        <token name="@TOOL_VERSION@">1.1.7</token>
+        <token name="@VERSION_SUFFIX@">0</token>
+        <token name="@PROFILE@">22.05</token>
+        <xml name="version_command">
+            <version_command><![CDATA[coreprofiler --version]]></version_command>
+        </xml>
+        <xml name="biotools">
+            <xrefs>
+                <xref type="bio.tools">coreprofiler</xref>
+                <xref type="bio.tools">blast</xref>
+            </xrefs>
+        </xml>
+        <xml name="element_assert" token_name="" token_text="">
+            <element name="@NAME@">
+                <assert_contents>
+                    <has_text text="@TEXT@"/>
+                    <yield/>
+                </assert_contents>
+            </element>
+        </xml>
+    </macros>
+    <requirements>
+        <requirement type="package" version="@TOOL_VERSION@">coreprofiler</requirement>
+        <requirement type="package" version="2.16.0">blast</requirement>
+    </requirements>
+    <command detect_errors="exit_code"><![CDATA[
+#set $scheme_name = str($coreprofiler_scheme_select).split('-')[0]
+#set $scheme_db = str($coreprofiler_scheme_select).split('-')[3]
+
+#set $db_version = str($db_version)
+
+mkdir -p '$out_file.extra_files_path' &&
+mkdir -p coreprofiler_${scheme_name}_${db_version}/scheme_$scheme_name/ &&
+#if $db_version == "token"
+        source "\$COREPROFILER_SECRETS_PATH" &&
+        if [[ -z "\$COREPROFILER_CONSUMER_TOKEN" || -z "\$COREPROFILER_CONSUMER_SECRET" || -z "\$COREPROFILER_ACCESS_TOKEN" || -z "\$COREPROFILER_ACCESS_SECRET" ]]; then
+            echo "Error: Missing required bash variables for CoreProfiler authentication." >&2;
+            echo "Please set the following variables before running this tool:" >&2;
+            echo "  COREPROFILER_CONSUMER_TOKEN" >&2;
+            echo "  COREPROFILER_CONSUMER_SECRET" >&2;
+            echo "  COREPROFILER_ACCESS_TOKEN" >&2;
+            echo "  COREPROFILER_ACCESS_SECRET" >&2;
+            echo "Please refer to the data manager help section for more information on how to generate these tokens." >&2;
+            exit 1;
+        fi &&
+#end if
+
+    coreprofiler db download
+        -s $scheme_name
+        -o coreprofiler_${scheme_name}_${db_version}/scheme_$scheme_name
+        ## Used only for tests
+        #if str($hide_test) == 'true':
+            -t
+        #end if
+        ##
+#if $db_version == "token"
+        -k "\$COREPROFILER_CONSUMER_TOKEN"
+        -ks "\$COREPROFILER_CONSUMER_SECRET"
+        -a "\$COREPROFILER_ACCESS_TOKEN"
+        -as "\$COREPROFILER_ACCESS_SECRET"
+#end if
+&&
+coreprofiler db makeblastdb
+    -s coreprofiler_${scheme_name}_${db_version}/scheme_$scheme_name 
+    -n $scheme_name 
+    -p coreprofiler_${scheme_name}_${db_version}/db_$scheme_name &&
+
+mv coreprofiler_${scheme_name}_${db_version} '$out_file.extra_files_path' &&
+
+cp '$dmjson' '$out_file'
+    ]]></command>
+    <configfiles>
+        <configfile name="dmjson"><![CDATA[
+#from datetime import date
+
+#set $scheme_name = str($coreprofiler_scheme_select).split('-')[0]
+#set $scheme_desc = str($coreprofiler_scheme_select).split('-')[1]
+#set $scheme_loci = str($coreprofiler_scheme_select).split('-')[2]
+#set $scheme_db = str($coreprofiler_scheme_select).split('-')[3]
+
+#set $db_version = str($db_version)
+
+{
+    "data_tables": {
+        "coreprofiler_scheme": [
+            {
+                "value": "coreprofiler_downloaded_#echo date.today().strftime('%d%m%Y')#-${scheme_name}-${scheme_desc}-${scheme_loci}-${scheme_db}-${db_version}",
+                "name": "${scheme_name}: ${scheme_desc} [${scheme_loci} loci] (${scheme_db}-${db_version})",
+                "path": "coreprofiler_${scheme_name}_${db_version}",
+                "database": "coreprofiler_${scheme_name}_${db_version}/db_${scheme_name}/${scheme_name}.fasta",
+                "scheme": "coreprofiler_${scheme_name}_${db_version}/scheme_${scheme_name}"
+            }
+        ]
+    }
+}]]></configfile>
+    </configfiles>
+    <inputs>
+        <param name="db_version" type="select" label="Scheme version to download from PubMLST or BIGSdb" help="If you choose the latest version, you must provide tokens and secrets to access the database, following the procedure described in the documentation." >
+            <option value="token" selected="true">Latest version from PubMLST or BIGSdb (requires the token and secret bash variables described in the documentation)</option>
+            <option value="no_token">No tokens: downloads the previous version, not the latest, from PubMLST or BIGSdb; also use this option for EnteroBase schemes</option>
+        </param>
+        <!-- used only for tests: limits the scheme download to 50 loci -->
+        <param name="hide_test" type="hidden" value=""/>
+        <!-- -->
+        <param name="coreprofiler_scheme_select" type="select" label="CoreProfiler available scheme" help="Choose a scheme from a reference platform supported by CoreProfiler">
+            <option value="bordetella_1-cgMLST_genus-1415-BIGSdb">bordetella_1: cgMLST_genus [1415 loci] (BIGSdb)</option>
+            <option value="bordetella_4-cgMLST_pertussis-2038-BIGSdb">bordetella_4: cgMLST_pertussis [2038 loci] (BIGSdb)</option>
+            <option value="diphtheria_1-cgMLST-1305-BIGSdb">diphtheria_1: cgMLST [1305 loci] (BIGSdb)</option>
+            <option value="diphtheria_5-cgMLST_ulcerans-1628-BIGSdb">diphtheria_5: cgMLST_ulcerans [1628 loci] (BIGSdb)</option>
+            <option value="klebsiella_15-cgMLST_KpI-2537-BIGSdb">klebsiella_15: cgMLST_KpI [2537 loci] (BIGSdb)</option>
+            <option value="klebsiella_10-cgMLST_ST258_ST512_ST1199-1371-BIGSdb">klebsiella_10: cgMLST_ST258_ST512_ST1199 [1371 loci] (BIGSdb)</option>
+            <option value="klebsiella_18-scgMLST629_S-629-BIGSdb">klebsiella_18: scgMLST629_S [629 loci] (BIGSdb)</option>
+            <option value="klebsiella_3-scgMLST634-632-BIGSdb">klebsiella_3: scgMLST634 [632 loci] (BIGSdb)</option>
+            <option value="leptospira_3-capture_cgMLST-545-BIGSdb">leptospira_3: capture_cgMLST [545 loci] (BIGSdb)</option>
+            <option value="leptospira_1-cgMLST-545-BIGSdb">leptospira_1: cgMLST [545 loci] (BIGSdb)</option>
+            <option value="listeria_3-cgMLST1748-1748-BIGSdb">listeria_3: cgMLST1748 [1748 loci] (BIGSdb)</option>
+            <option value="yersinia_2-Y.enterocolitica_cgMLST-1727-BIGSdb">yersinia_2: Y.enterocolitica cgMLST [1727 loci] (BIGSdb)</option>
+            <option value="yersinia_1-Yersinia_cgMLST-500-BIGSdb">yersinia_1: Yersinia cgMLST [500 loci] (BIGSdb)</option>
+            <option value="yersinia_3-Y.pseudotuberculosis_cgMLST-1921-BIGSdb">yersinia_3: Y.pseudotuberculosis cgMLST [1921 loci] (BIGSdb)</option>
+            <option value="abaumannii_3-cgMLST_v1-2133-pubmlst">abaumannii_3: cgMLST v1 [2133 loci] (pubmlst)</option>
+            <option value="bcereus_2-B._anthracis_cgMLST-3803-pubmlst">bcereus_2: B. anthracis cgMLST [3803 loci] (pubmlst)</option>
+            <option value="bcereus_5-B._cereus_cgMLST-1568-pubmlst">bcereus_5: B. cereus cgMLST [1568 loci] (pubmlst)</option>
+            <option value="borrelia_3-cgMLST-639-pubmlst">borrelia_3: cgMLST [639 loci] (pubmlst)</option>
+            <option value="brucella_3-cgMLST-1764-pubmlst">brucella_3: cgMLST [1764 loci] (pubmlst)</option>
+            <option value="bmallei_1-cgMLST-3311-pubmlst">bmallei_1: cgMLST [3311 loci] (pubmlst)</option>
+            <option value="bpseudomallei_2-cgMLST-4090-pubmlst">bpseudomallei_2: cgMLST [4090 loci] (pubmlst)</option>
+            <option value="campylobacter_4-C._jejuni_/_C._coli_cgMLST_v1-1343-pubmlst">campylobacter_4: C. jejuni / C. coli cgMLST v1 [1343 loci] (pubmlst)</option>
+            <option value="campylobacter_8-C._jejuni_/_C._coli_cgMLST_v2-1142-pubmlst">campylobacter_8: C. jejuni / C. coli cgMLST v2 [1142 loci] (pubmlst)</option>
+            <option value="chlamydiales_44-C._abortus_cgMLST_v1.0-959-pubmlst">chlamydiales_44: C. abortus cgMLST v1.0 [959 loci] (pubmlst)</option>
+            <option value="chlamydiales_42-C._trachomatis_cgMLST_v1.0-817-pubmlst">chlamydiales_42: C. trachomatis cgMLST v1.0 [817 loci] (pubmlst)</option>
+            <option value="cchauvoei_1-cgMLST-2223-pubmlst">cchauvoei_1: cgMLST [2223 loci] (pubmlst)</option>
+            <option value="cperfringens_2-cgMLST-1431-pubmlst">cperfringens_2: cgMLST [1431 loci] (pubmlst)</option>
+            <option value="dnodosus_3-cgMLST-714-pubmlst">dnodosus_3: cgMLST [714 loci] (pubmlst)</option>
+            <option value="escherichia_v1-cgMLST-2513-enterobase">escherichia_v1: cgMLST v1 [2513 loci] (enterobase)</option>
+            <option value="hinfluenzae_56-cgMLST_v1-1037-pubmlst">hinfluenzae_56: cgMLST v1 [1037 loci] (pubmlst)</option>
+            <option value="leptospira_4-cgMLST-1565-pubmlst">leptospira_4: cgMLST [1565 loci] (pubmlst)</option>
+            <option value="mabscessus_2-cgMLST-2904-pubmlst">mabscessus_2: cgMLST [2904 loci] (pubmlst)</option>
+            <option value="neisseria_72-Human_restricted_Neisseria_cgMLST_v1.0-1441-pubmlst">neisseria_72: Human-restricted Neisseria cgMLST v1.0 [1441 loci] (pubmlst)</option>
+            <option value="neisseria_45-L3_cgMLST-1742-pubmlst">neisseria_45: L3 cgMLST [1742 loci] (pubmlst)</option>
+            <option value="neisseria_68-L44_cgMLST-1699-pubmlst">neisseria_68: L44 cgMLST [1699 loci] (pubmlst)</option>
+            <option value="neisseria_62-N._gonorrhoeae_cgMLST_v1.0-1649-pubmlst">neisseria_62: N. gonorrhoeae cgMLST v1.0 [1649 loci] (pubmlst)</option>
+            <option value="neisseria_89-N._gonorrhoeae_cgMLST_v2-1430-pubmlst">neisseria_89: N. gonorrhoeae cgMLST v2 [1430 loci] (pubmlst)</option>
+            <option value="neisseria_47-N._meningitidis_cgMLST_v1-1605-pubmlst">neisseria_47: N. meningitidis cgMLST v1 [1605 loci] (pubmlst)</option>
+            <option value="neisseria_85-N._meningitidis_cgMLST_v2-1422-pubmlst">neisseria_85: N. meningitidis cgMLST v2 [1422 loci] (pubmlst)</option>
+            <option value="neisseria_88-N._meningitidis_cgMLST_v3-1329-pubmlst">neisseria_88: N. meningitidis cgMLST v3 [1329 loci] (pubmlst)</option>
+            <option value="pmultocida_3-cgMLST_draft-1233-pubmlst">pmultocida_3: cgMLST (draft) [1233 loci] (pubmlst)</option>
+            <option value="salmonella_v2-cgMLST-3002-enterobase">salmonella_v2: cgMLST v2 [3002 loci] (enterobase)</option>
+            <option value="salmonella_3-SalmcgMLST_v1.0-2750-pubmlst">salmonella_3: SalmcgMLST v1.0 [2750 loci] (pubmlst)</option>
+            <option value="serratia_2-cgMLST-2692-pubmlst">serratia_2: cgMLST [2692 loci] (pubmlst)</option>
+            <option value="saureus_20-cgMLST-1716-pubmlst">saureus_20: cgMLST [1716 loci] (pubmlst)</option>
+            <option value="sagalactiae_38-h_S.agalactiae_cgMLST_v1.0-1405-pubmlst">sagalactiae_38: h_S.agalactiae cgMLST v1.0 [1405 loci] (pubmlst)</option>
+            <option value="spneumoniae_2-cgMLST-1222-pubmlst">spneumoniae_2: cgMLST [1222 loci] (pubmlst)</option>
+            <option value="suberis_8-cgMLST-1447-pubmlst">suberis_8: cgMLST [1447 loci] (pubmlst)</option>
+            <option value="vcholerae_3-cgMLST-2443-pubmlst">vcholerae_3: cgMLST [2443 loci] (pubmlst)</option>
+            <option value="vparahaemolyticus_3-cgMLST-2254-pubmlst">vparahaemolyticus_3: cgMLST [2254 loci] (pubmlst)</option>
+            <option value="xcitri_1-cgMLST-1618-pubmlst">xcitri_1: cgMLST [1618 loci] (pubmlst)</option>
+        </param>
+    </inputs>
+    <outputs>
+        <data name="out_file" format="data_manager_json" label="${tool.name}"/>
+    </outputs>
+    <tests>
+        <test expect_num_outputs="1">
+            <param name="hide_test" value="true"/>
+            <param name="coreprofiler_scheme_select" value="borrelia_3-cgMLST-639-pubmlst" />
+            <param name="db_version" value="no_token"/>
+            <output name="out_file">
+                <assert_contents>
+                    <has_text text='"coreprofiler_scheme":'/>
+                    <has_text_matching expression='"value": "coreprofiler_downloaded_[0-9]{8}-borrelia_3-cgMLST-639-pubmlst-no_token"'/>
+                    <has_text text='"name": "borrelia_3: cgMLST [639 loci] (pubmlst-no_token)"'/>
+                    <has_text text='"path": "coreprofiler_borrelia_3_no_token"'/>
+                    <has_text text='"database": "coreprofiler_borrelia_3_no_token/db_borrelia_3/borrelia_3.fasta"'/>
+                    <has_text text='"scheme": "coreprofiler_borrelia_3_no_token/scheme_borrelia_3"'/>
+                </assert_contents>
+            </output>
+        </test>
+        <test expect_num_outputs="1">
+            <param name="hide_test" value="true"/>
+            <param name="coreprofiler_scheme_select" value="yersinia_1-Yersinia_cgMLST-500-BIGSdb" />
+            <param name="db_version" value="no_token"/>
+            <output name="out_file">
+                <assert_contents>
+                    <has_text text='"coreprofiler_scheme":'/>
+                    <has_text_matching expression='"value": "coreprofiler_downloaded_[0-9]{8}-yersinia_1-Yersinia_cgMLST-500-BIGSdb-no_token"'/>
+                    <has_text text='"name": "yersinia_1: Yersinia_cgMLST [500 loci] (BIGSdb-no_token)"'/>
+                    <has_text text='"path": "coreprofiler_yersinia_1_no_token"'/>
+                    <has_text text='"database": "coreprofiler_yersinia_1_no_token/db_yersinia_1/yersinia_1.fasta"'/>
+                    <has_text text='"scheme": "coreprofiler_yersinia_1_no_token/scheme_yersinia_1"'/>
+                </assert_contents>
+            </output>
+        </test>
+        <test expect_num_outputs="1">
+            <param name="hide_test" value="true"/>
+            <param name="coreprofiler_scheme_select" value="escherichia_v1-cgMLST-2513-enterobase" />
+            <param name="db_version" value="no_token"/>
+            <output name="out_file">
+                <assert_contents>
+                    <has_text text='"coreprofiler_scheme":'/>
+                    <has_text_matching expression='"value": "coreprofiler_downloaded_[0-9]{8}-escherichia_v1-cgMLST-2513-enterobase-no_token"'/>
+                    <has_text text='"name": "escherichia_v1: cgMLST [2513 loci] (enterobase-no_token)"'/>
+                    <has_text text='"path": "coreprofiler_escherichia_v1_no_token"'/>
+                    <has_text text='"database": "coreprofiler_escherichia_v1_no_token/db_escherichia_v1/escherichia_v1.fasta"'/>
+                    <has_text text='"scheme": "coreprofiler_escherichia_v1_no_token/scheme_escherichia_v1"'/>
+                </assert_contents>
+            </output>
+        </test>
+    </tests>
+    <help><![CDATA[
+This tool downloads and builds the **CoreProfiler** scheme.
+
+You can find the list of available schemes, as well as the reference platforms supported by CoreProfiler, in the `CoreProfiler documentation <https://gitlab.com/ifb-elixirfr/abromics/coreprofiler/-/blob/main/README.md?ref_type=heads#basic-usage>`_.
+
+Please refer to that page for details on how to use the tool and which schemes are available.
+
+Use Galaxy's data manager framework to download and install new CoreProfiler schemes.
+
+If you want to use a scheme from **EnteroBase**, you do not need to provide any token or secret.
+
+However, if you want to use a scheme from **PubMLST** or **BIGSdb**, you need to complete the procedure below before launching the data manager.
+
+BIGSdb and PubMLST require OAuth1 authentication to download the most up-to-date schemes.
+Authentication is not strictly mandatory, but without it the download may return an outdated scheme.
+
+This authentication involves two types of tokens:
+
+* **Consumer tokens**: permanent tokens used to initiate the authentication flow.
+
+* **Access tokens**: tokens required to download a scheme.
+
+**Procedure for PubMLST schemes** (example: ``borrelia_3-cgMLST-639-pubmlst``)
+
+1. Create an account on the `PubMLST website <https://pubmlst.org/bigsdb>`_.
+2. Generate a consumer token and secret from your account settings (My account > API keys > Enter key name > Submit).
+3. On your account page, go to Database registrations, check all databases, and register.
+4. Download `coreprofiler <https://gitlab.com/ifb-elixirfr/abromics/coreprofiler>`_ locally and run the following command to obtain your access token and secret::
+
+      coreprofiler db get_request_tokens --scheme <SCHEME_NAME> --consumer_key <YOUR_CONSUMER_TOKEN> --consumer_secret <YOUR_CONSUMER_SECRET>
+
+   Replace the placeholders with your scheme of interest (example: ``borrelia_3``) and your actual consumer token and secret.
+
+   This command prints a URL to visit in order to authorize the client software to access your account.
+   After you authorize it, you receive a verification code; enter it at the command-line prompt.
+   The command then returns your access token and secret.
+
+5. Make the consumer token, consumer secret, access token, and access secret available to the data manager tool by exporting them as bash variables in a ``.txt`` file::
+
+      export COREPROFILER_CONSUMER_TOKEN="<YOUR_CONSUMER_TOKEN>"
+      export COREPROFILER_CONSUMER_SECRET="<YOUR_CONSUMER_SECRET>"
+      export COREPROFILER_ACCESS_TOKEN="<YOUR_ACCESS_TOKEN>"
+      export COREPROFILER_ACCESS_SECRET="<YOUR_ACCESS_SECRET>"
+
+6. Point the ``COREPROFILER_SECRETS_PATH`` environment variable to this ``.txt`` file (example: ``export COREPROFILER_SECRETS_PATH="/path/to/your/secret_file.txt"``).
+
+**Procedure for BIGSdb schemes** (example: ``bordetella_1-cgMLST_genus-1415-BIGSdb``)
+
+1. Create an account on the `BIGSdb website <https://bigsdb.pasteur.fr/cgi-bin/bigsdb/bigsdb.pl?page=registration>`_.
+2. Request a consumer token and secret by sending an email to ``bigsdb@pasteur.fr`` (subject: "API client key").
+3. On your account page, go to Database registrations, check all databases, and register.
+4. Download `coreprofiler <https://gitlab.com/ifb-elixirfr/abromics/coreprofiler>`_ locally and run the following command to obtain your access token and secret::
+
+      coreprofiler db get_request_tokens --scheme <SCHEME_NAME> --consumer_key <YOUR_CONSUMER_TOKEN> --consumer_secret <YOUR_CONSUMER_SECRET>
+
+   Replace the placeholders with your scheme of interest (example: ``bordetella_1``) and your actual consumer token and secret.
+
+   This command prints a URL to visit in order to authorize the client software to access your account.
+   After you authorize it, you receive a verification code; enter it at the command-line prompt.
+   The command then returns your access token and secret.
+
+5. Make the consumer token, consumer secret, access token, and access secret available to the data manager tool by exporting them as bash variables in a ``.txt`` file::
+
+      export COREPROFILER_CONSUMER_TOKEN="<YOUR_CONSUMER_TOKEN>"
+      export COREPROFILER_CONSUMER_SECRET="<YOUR_CONSUMER_SECRET>"
+      export COREPROFILER_ACCESS_TOKEN="<YOUR_ACCESS_TOKEN>"
+      export COREPROFILER_ACCESS_SECRET="<YOUR_ACCESS_SECRET>"
+
+6. Point the ``COREPROFILER_SECRETS_PATH`` environment variable to this ``.txt`` file (example: ``export COREPROFILER_SECRETS_PATH="/path/to/your/secret_file.txt"``).
+    ]]></help>
+    <citations>
+        <citation type="doi">10.3390/microorganisms10020292</citation>
+    </citations>
+</tool>
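The `command` and `dmjson` templates in the tool above split the selected option value on `-` and read four positional fields (name, description, loci count, database), then build the data-table row from them. The Python sketch below mirrors that logic for illustration. `parse_scheme_id` and `build_entry` are hypothetical names, and the strict four-field check is an addition that the tool itself does not perform.

```python
from datetime import date


def parse_scheme_id(value):
    """Split an option value into (name, description, loci, database)."""
    parts = value.split("-")
    if len(parts) != 4:
        raise ValueError(
            f"expected 4 '-'-separated fields, got {len(parts)}: {value!r}"
        )
    return tuple(parts)


def build_entry(value, db_version, today=None):
    """Build the same five fields that the dmjson config file writes."""
    name, desc, loci, db = parse_scheme_id(value)
    # The template formats the download date as day, month, year.
    stamp = (today or date.today()).strftime("%d%m%Y")
    base = f"coreprofiler_{name}_{db_version}"
    return {
        "value": f"coreprofiler_downloaded_{stamp}-{name}-{desc}-{loci}-{db}-{db_version}",
        "name": f"{name}: {desc} [{loci} loci] ({db}-{db_version})",
        "path": base,
        "database": f"{base}/db_{name}/{name}.fasta",
        "scheme": f"{base}/scheme_{name}",
    }


entry = build_entry("borrelia_3-cgMLST-639-pubmlst", "no_token", date(2025, 11, 17))
print(entry["value"])
print(entry["name"])
```

Because the fields are positional, an option value with an extra or missing `-` shifts every field after it, which is why the sketch rejects anything other than exactly four fields.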
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/data_manager_conf.xml	Fri Nov 21 13:11:30 2025 +0000
@@ -0,0 +1,21 @@
+<?xml version="1.0"?>
+<data_managers>
+    <data_manager tool_file="data_manager/data_manager_build_coreprofiler_download.xml" id="data_manager_build_coreprofiler_download">
+        <data_table name="coreprofiler_scheme">  <!-- Defines a Data Table to be modified. -->
+            <output> <!-- Handle the output of the Data Manager Tool -->
+                <column name="value"/>  <!-- columns that are going to be specified by the Data Manager Tool -->
+                <column name="name"/>  <!-- columns that are going to be specified by the Data Manager Tool -->
+                <column name="path" output_ref="out_file">
+                    <move type="directory">
+                        <source>${path}</source>
+                        <target base="${GALAXY_DATA_MANAGER_DATA_PATH}">coreprofiler/${path}</target>
+                    </move>
+                    <value_translation>${GALAXY_DATA_MANAGER_DATA_PATH}/coreprofiler/${path}</value_translation>
+                    <value_translation type="function">abspath</value_translation>
+                </column>
+                <column name="database"/> <!-- columns that are going to be specified by the Data Manager Tool -->
+                <column name="scheme"/> <!-- columns that are going to be specified by the Data Manager Tool -->
+            </output>
+        </data_table>
+    </data_manager>
+</data_managers>
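The `path` column in `data_manager_conf.xml` above goes through two `value_translation` steps: a template substitution followed by the `abspath` function. The sketch below illustrates the effect in Python, assuming the two steps are applied in the order listed. `translate_path` is a hypothetical helper and `/galaxy/tool-data` is a made-up data directory; neither comes from Galaxy itself.

```python
import os
from string import Template

# Same template text as the first value_translation element.
TRANSLATION = Template("${GALAXY_DATA_MANAGER_DATA_PATH}/coreprofiler/${path}")


def translate_path(path, data_manager_data_path):
    """Substitute both placeholders, then make the result absolute."""
    substituted = TRANSLATION.substitute(
        GALAXY_DATA_MANAGER_DATA_PATH=data_manager_data_path,
        path=path,
    )
    return os.path.abspath(substituted)


# Hypothetical data directory, for illustration only.
print(translate_path("coreprofiler_borrelia_3_no_token", "/galaxy/tool-data"))
```

This is consistent with the absolute `path` values recorded in the test `.loc` file, while the `database` and `scheme` columns are stored as written by the tool.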
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/test-data/coreprofiler_scheme.loc.test	Fri Nov 21 13:11:30 2025 +0000
@@ -0,0 +1,18 @@
+#This is a tab-separated file describing the location of CoreProfiler schemes
+#used by the CoreProfiler tool
+#
+#The file has this format (whitespace characters are TAB characters)
+#
+#The columns are:
+#value	name	path	database	scheme
+#
+#For example
+#coreprofiler_downloaded_20250625_klebsiella_3_scgMLST634_632_loci_bigsdb	klebsiella_3: scgMLST634 [632 loci] (BIGSdb)	coreprofiler_klebsiella_3	coreprofiler_klebsiella_3/db_klebsiella_3/klebsiella_3.fasta	coreprofiler_klebsiella_3/scheme_klebsiella_3
+coreprofiler_downloaded_10102025-borrelia_3-cgMLST-639-pubmlst-no_token	borrelia_3: cgMLST [639 loci] (pubmlst-no_token)	/tmp/tmphn3k3ekk/galaxy-dev/tool-data/coreprofiler/coreprofiler_borrelia_3_no_token	coreprofiler_borrelia_3_no_token/db_borrelia_3/borrelia_3.fasta	coreprofiler_borrelia_3_no_token/scheme_borrelia_3
+coreprofiler_downloaded_10102025-bordetella_1-cgMLST_genus-1415-BIGSdb-no_token	bordetella_1: cgMLST_genus [1415 loci] (BIGSdb-no_token)	/tmp/tmphn3k3ekk/galaxy-dev/tool-data/coreprofiler/coreprofiler_bordetella_1_no_token	coreprofiler_bordetella_1_no_token/db_bordetella_1/bordetella_1.fasta	coreprofiler_bordetella_1_no_token/scheme_bordetella_1
+coreprofiler_downloaded_28102025-yersinia_1-Yersinia_cgMLST-500-BIGSdb-no_token	yersinia_1: Yersinia_cgMLST [500 loci] (BIGSdb-no_token)	/tmp/tmpomzcgw6j/galaxy-dev/tool-data/coreprofiler/coreprofiler_yersinia_1_no_token	coreprofiler_yersinia_1_no_token/db_yersinia_1/yersinia_1.fasta	coreprofiler_yersinia_1_no_token/scheme_yersinia_1
+coreprofiler_downloaded_17112025-yersinia_1-Yersinia_cgMLST-500-BIGSdb-no_token	yersinia_1: Yersinia_cgMLST [500 loci] (BIGSdb-no_token)	/tmp/tmphe1udqum/galaxy-dev/tool-data/coreprofiler/coreprofiler_yersinia_1_no_token	coreprofiler_yersinia_1_no_token/db_yersinia_1/yersinia_1.fasta	coreprofiler_yersinia_1_no_token/scheme_yersinia_1
+coreprofiler_downloaded_17112025-escherichia_v1-cgMLST-2513-enterobase-no_token	escherichia_v1: cgMLST [2513 loci] (enterobase-no_token)	/tmp/tmphe1udqum/galaxy-dev/tool-data/coreprofiler/coreprofiler_escherichia_v1_no_token	coreprofiler_escherichia_v1_no_token/db_escherichia_v1/escherichia_v1.fasta	coreprofiler_escherichia_v1_no_token/scheme_escherichia_v1
+coreprofiler_downloaded_17112025-borrelia_3-cgMLST-639-pubmlst-no_token	borrelia_3: cgMLST [639 loci] (pubmlst-no_token)	/tmp/tmp6njg6emw/galaxy-dev/tool-data/coreprofiler/coreprofiler_borrelia_3_no_token	coreprofiler_borrelia_3_no_token/db_borrelia_3/borrelia_3.fasta	coreprofiler_borrelia_3_no_token/scheme_borrelia_3
+coreprofiler_downloaded_17112025-yersinia_1-Yersinia_cgMLST-500-BIGSdb-no_token	yersinia_1: Yersinia_cgMLST [500 loci] (BIGSdb-no_token)	/tmp/tmp6njg6emw/galaxy-dev/tool-data/coreprofiler/coreprofiler_yersinia_1_no_token	coreprofiler_yersinia_1_no_token/db_yersinia_1/yersinia_1.fasta	coreprofiler_yersinia_1_no_token/scheme_yersinia_1
+coreprofiler_downloaded_17112025-escherichia_v1-cgMLST-2513-enterobase-no_token	escherichia_v1: cgMLST [2513 loci] (enterobase-no_token)	/tmp/tmp6njg6emw/galaxy-dev/tool-data/coreprofiler/coreprofiler_escherichia_v1_no_token	coreprofiler_escherichia_v1_no_token/db_escherichia_v1/escherichia_v1.fasta	coreprofiler_escherichia_v1_no_token/scheme_escherichia_v1
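The `.loc` file above is a tab-separated table with five columns and `#` as the comment character. The Python sketch below shows one way to read such a file; `read_loc` is a hypothetical helper and not the parser Galaxy uses.

```python
# Column order as documented in the .loc file header.
COLUMNS = ("value", "name", "path", "database", "scheme")


def read_loc(text, comment_char="#"):
    """Parse .loc content into a list of dicts, skipping comments and blanks."""
    rows = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.startswith(comment_char):
            continue
        fields = line.split("\t")
        if len(fields) != len(COLUMNS):
            raise ValueError(
                f"line {number}: expected {len(COLUMNS)} fields, got {len(fields)}"
            )
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


# One comment line and one data row copied from the test file above.
sample = "\n".join([
    "#value\tname\tpath\tdatabase\tscheme",
    "\t".join([
        "coreprofiler_downloaded_17112025-borrelia_3-cgMLST-639-pubmlst-no_token",
        "borrelia_3: cgMLST [639 loci] (pubmlst-no_token)",
        "/tmp/tmp6njg6emw/galaxy-dev/tool-data/coreprofiler/coreprofiler_borrelia_3_no_token",
        "coreprofiler_borrelia_3_no_token/db_borrelia_3/borrelia_3.fasta",
        "coreprofiler_borrelia_3_no_token/scheme_borrelia_3",
    ]),
])
rows = read_loc(sample)
print(rows[0]["name"])
```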
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tool-data/coreprofiler_scheme.loc.sample	Fri Nov 21 13:11:30 2025 +0000
@@ -0,0 +1,10 @@
+#This is a tab-separated file describing the location of CoreProfiler schemes
+#used by the CoreProfiler tool
+#
+#The file has this format (whitespace characters are TAB characters)
+#
+#The columns are:
+#value	name	path	database	scheme
+#
+#For example
+#coreprofiler_downloaded_20250625_klebsiella_3_scgMLST634_632_loci_bigsdb	klebsiella_3: scgMLST634 [632 loci] (BIGSdb)	coreprofiler_klebsiella_3	coreprofiler_klebsiella_3/db_klebsiella_3/klebsiella_3.fasta	coreprofiler_klebsiella_3/scheme_klebsiella_3
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tool_data_table_conf.xml.sample	Fri Nov 21 13:11:30 2025 +0000
@@ -0,0 +1,6 @@
+<tables>
+    <table name="coreprofiler_scheme" comment_char="#">
+        <columns>value, name, path, database, scheme</columns>
+        <file path="tool-data/coreprofiler_scheme.loc"/>
+    </table>
+</tables>
\ No newline at end of file
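The table definition above declares the column order that the `.loc` file must follow. As a small illustration, the Python sketch below reads the column names from that XML with the standard library. `table_columns` is a hypothetical helper, not a Galaxy function.

```python
import xml.etree.ElementTree as ET

# Content of tool_data_table_conf.xml.sample as shown in the diff above.
TABLE_CONF = """<tables>
    <table name="coreprofiler_scheme" comment_char="#">
        <columns>value, name, path, database, scheme</columns>
        <file path="tool-data/coreprofiler_scheme.loc"/>
    </table>
</tables>"""


def table_columns(xml_text, table_name):
    """Return the ordered column names declared for the given table."""
    root = ET.fromstring(xml_text)
    for table in root.findall("table"):
        if table.get("name") == table_name:
            columns = table.findtext("columns", default="")
            return [column.strip() for column in columns.split(",")]
    raise KeyError(table_name)


print(table_columns(TABLE_CONF, "coreprofiler_scheme"))
```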
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tool_data_table_conf.xml.test	Fri Nov 21 13:11:30 2025 +0000
@@ -0,0 +1,6 @@
+<tables>
+    <table name="coreprofiler_scheme" comment_char="#">
+        <columns>value, name, path, database, scheme</columns>
+        <file path="${__HERE__}/test-data/coreprofiler_scheme.loc.test"/>
+    </table>
+</tables>
\ No newline at end of file