Software by Dyliss

AskOmics
- Functional description:
  
  AskOmics aims at bridging the gap between end-user data and the Linked (Open) Data cloud. It allows heterogeneous bioinformatics data (formatted as tabular files) to be loaded into an RDF triplestore and then transparently and interactively queried.
  AskOmics is made of three software blocks: (1) a web interface for data import, which creates a local triplestore from the user's datasheets and standard data, (2) an interactive web interface for "à la carte" query building, and (3) a server that handles interactions with local and distant triplestores (query execution, management of user parameters).
- Website:
  
  https://askomics.org/
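
To illustrate the idea of loading tabular files into a triplestore, here is a minimal sketch of how rows could be turned into RDF-style triples. It is not AskOmics code: the `BASE` namespace, the `table_to_triples` helper, and the convention that the first column identifies the entity are all assumptions made for the example.

```python
import csv
import io

# Hypothetical namespace; the URI scheme actually used by AskOmics is not given here.
BASE = "http://example.org/data/"


def table_to_triples(tsv_text, entity_type):
    """Turn each row of a tab-separated table into (subject, predicate, object) triples.

    Assumption: the first column identifies the entity and every other
    column becomes one attribute of that entity.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    triples = []
    for row in reader:
        subject = BASE + row[0]
        triples.append((subject, "rdf:type", BASE + entity_type))
        for column, value in zip(header[1:], row[1:]):
            triples.append((subject, BASE + column, value))
    return triples


example = "gene\tchromosome\tstart\nAT1G01010\tChr1\t3631\n"
triples = table_to_triples(example, "Gene")
for triple in triples:
    print(triple)
```

Once data are in this subject/predicate/object form, tables that share identifiers can be queried together, which is what the interactive query builder relies on.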
Meneco
- Functional description:
  
  Meneco is a qualitative approach to elaborating the biosynthetic capacities of metabolic networks. Large-scale metabolic networks, as well as measured datasets, suffer from substantial incompleteness. Moreover, traditional formal approaches to biosynthesis require kinetic information, which is rarely available. Our approach builds upon formal systems for analyzing large-scale metabolic networks. Mapping its principles into Answer Set Programming allows us to address various biologically relevant problems.
- Website:
  
  https://github.com/bioasp/meneco
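
The qualitative criterion behind this kind of analysis can be sketched in a few lines. The example below assumes the topological notion of producibility known as network expansion: a metabolite is producible if it is a seed or the product of a reaction whose reactants are all producible. The `scope` function and the toy networks are hypothetical; Meneco itself encodes the problem in Answer Set Programming and searches for minimal sets of reactions, whereas this sketch only tests single candidate reactions by brute force.

```python
def scope(reactions, seeds):
    """Return every metabolite reachable from the seeds.

    `reactions` maps a reaction name to (reactants, products), both sets.
    A reaction fires once all of its reactants are available.
    """
    available = set(seeds)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions.values():
            if reactants <= available and not products <= available:
                available |= products
                changed = True
    return available


# Toy draft network with a gap between g6p and f6p (hypothetical data).
draft = {
    "r1": ({"glucose"}, {"g6p"}),
    "r3": ({"f6p"}, {"pyruvate"}),
}
# Toy reference database of candidate reactions (hypothetical data).
repair_db = {
    "r2": ({"g6p"}, {"f6p"}),
    "r9": ({"x"}, {"y"}),
}
seeds, targets = {"glucose"}, {"pyruvate"}

unproducible = targets - scope(draft, seeds)
# Candidate reactions that, added alone, make every target producible.
fixes = [name for name, reaction in repair_db.items()
         if targets <= scope({**draft, name: reaction}, seeds)]
print(unproducible, fixes)
```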
Fluto
- Functional description:
  
  Fluto relies on Answer Set Programming (ASP) and on hybrid modelling that couples ASP with a Linear Programming (LP) constraint propagator. Models satisfying the qualitative constraints of network expansion are tested for satisfiability of flux constraints with the LP propagator. The resulting answer sets yield a completion of the metabolic network that ensures the metabolic reaction of interest is activated according to both formalisms.
- Website:
  
  https://github.com/cfrioux/fluto/
MiSCoTo
- Functional description:
  
  Metabolic networks are composed of biochemical reactions and gather the expected metabolic capabilities of species. For organisms that live in interaction with one another (microbiota), complementarity between these networks can be exploited to predict cooperation events. This software takes as input the metabolic networks of various species (host, symbionts of the microbiota), the components of the growth medium and a metabolic objective (metabolites to be produced), and selects a minimal set of symbionts that ensures the metabolic objective can be achieved. The software offers two modeling approaches: a simplified one, and another that takes into account the cost of metabolic exchanges and aims at minimizing it.
- Website:
  
  https://github.com/cfrioux/miscoto
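
The selection problem can be illustrated with a brute-force sketch. It assumes that the simplified modeling pools the reactions of the host and of the chosen symbionts into one network, and that producibility is checked topologically from the growth medium. Both points are assumptions for the example, as are the `scope` and `minimal_communities` helpers and the toy data. MiSCoTo itself relies on ASP solving rather than on enumeration.

```python
from itertools import combinations


def scope(reactions, seeds):
    """Metabolites reachable from the seeds; each reaction is (reactants, products)."""
    available = set(seeds)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions:
            if reactants <= available and not products <= available:
                available |= products
                changed = True
    return available


def minimal_communities(host, symbionts, medium, objective):
    """Return all smallest sets of symbionts that make the objective producible."""
    names = sorted(symbionts)
    for size in range(len(names) + 1):
        found = []
        for group in combinations(names, size):
            pooled = list(host)
            for name in group:
                pooled += symbionts[name]
            if objective <= scope(pooled, medium):
                found.append(set(group))
        if found:  # stop at the first size that works: these sets are minimal
            return found
    return []


# Toy data (hypothetical): the host cannot turn b into c on its own.
host = [({"a"}, {"b"}), ({"c"}, {"d"})]
symbionts = {
    "s1": [({"b"}, {"c"})],
    "s2": [({"b"}, {"x"})],
    "s3": [({"x"}, {"c"})],
}
solutions = minimal_communities(host, symbionts, medium={"a"}, objective={"d"})
print(solutions)
```

Enumerating subsets by increasing size guarantees minimality but grows exponentially with the number of symbionts, which is why a combinatorial solver is the practical choice for real microbiota.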
PADMet Package
- Functional description:
  
  The PADMet package reconciles the genomic and metabolic network information used to produce a genome-scale constraint-based metabolic model, within a database that traces every step of the reconstruction process. It can represent the metabolic model as a wiki and as reports containing all the used and traced information. The main concept underlying the PADMet package is to provide solutions that ensure the consistency, internal standardization and reconciliation of the information used within any workflow that combines several tools for metabolic network reconstruction or analysis.
- Website:
  
  https://github.com/AuReMe/padmet
AuReMe
- Functional description:
  
  AuReMe enables the reconstruction of metabolic networks from different sources based on sequence annotation, orthology, gap-filling and manual curation. The metabolic network is exported as a local wiki that makes it possible to trace back all the steps and sources of the reconstruction. It is highly relevant for the study of non-model organisms, or the comparison of metabolic networks for different strains or a single organism.
  
  AuReMe is composed of five modules: 1) The model-management PADMet module allows all metabolic data to be manipulated and traced via a local database. 2) The meneco Python package fills the gaps of a metabolic network with a topological approach that uses logic programming to solve a combinatorial problem. 3) The shogen Python package aligns a genome and a metabolic network in order to identify genome units that contain a high density of genes coding for enzymes; it also relies on logic programming. 4) The manual curation assistance PADMet module allows the reported metabolic networks and their metadata to be curated. 5) The wiki-export PADMet module exports the metabolic network and its functional genomic units as a local wiki platform for user-friendly investigation.
- Website:
  
  http://aureme.genouest.org/
BioASP
- Functional description:
  
  ASP4 biology and BioASP: this meta-package creates an environment for biological data integration and analysis in systems biology, based on knowledge representation and combinatorial optimization technologies (ASP). It provides a collection of Python applications that encapsulate ASP tools and several encodings, making them easy to use out of the box by non-expert users.
- Website:
  
  https://bioasp.github.io/
CADBIOM
- Functional description:
  
  The Cadbiom software provides a formal framework to help model biological systems, such as cell signaling networks, with guarded transition semantics. It allows synchronization events to be investigated in large-scale biological networks in order to extract signatures of the controllers of a phenotype.
  Cadbiom is composed of three modules. 1) The Cadbiom graphical interface is useful for building and studying moderate-size models. It provides exploration, simulation and checking. For large-scale models, Cadbiom also makes it possible to focus on specific nodes of interest. 2) The Cadbiom API allows a model to be loaded, static analysis to be performed, and temporal properties to be checked on a finite horizon in the future or in the past. 3) Large-scale knowledge repositories can be explored: the large-scale PID repository (about 10,000 curated interactions) has been translated into the Cadbiom formalism.
- Website:
  
  http://cadbiom.genouest.org
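
The notion of a guarded transition can be made concrete with a small simulation sketch. It assumes a deliberately simplified semantics: a transition moves activity from a source place to a target place when the source is active and a Boolean guard on the current state holds, and all enabled transitions fire together in one step. The `step` function and the toy signaling model are hypothetical and do not use the Cadbiom API; the actual formalism is richer than this.

```python
def step(active, transitions):
    """Fire every enabled guarded transition once, synchronously.

    `active` is the set of active places; each transition is a tuple
    (source, target, guard) where guard is a predicate on the active set.
    """
    fired = [(src, dst) for src, dst, guard in transitions
             if src in active and guard(active)]
    following = set(active)
    for src, _ in fired:
        following.discard(src)
    for _, dst in fired:
        following.add(dst)
    return following


# Toy signaling model (hypothetical names).
transitions = [
    ("ligand_free", "ligand_bound", lambda s: "receptor" in s),
    ("tf_inactive", "tf_active",
     lambda s: "ligand_bound" in s and "inhibitor" not in s),
]

state0 = {"ligand_free", "receptor", "tf_inactive"}
state1 = step(state0, transitions)
state2 = step(state1, transitions)
print(sorted(state1), sorted(state2))
```

Adding `"inhibitor"` to the state blocks the second transition, which is the kind of condition one looks for when searching for controllers of a phenotype.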
pax2graphml
- Functional description:
  
  PAX2GRAPHML is an open-source Python library that makes it easy to manipulate BioPAX source files as regulated reaction graphs described in the .graphml format. PAX2GRAPHML is highly flexible and generates graphs of regulated reactions from a single BioPAX source or by combining and filtering several BioPAX sources. Because it supports the .graphml graph exchange format, the large-scale graphs produced from one or more data sources can be further analyzed with PAX2GRAPHML or with standard Python and R graph libraries.
- Website:
  
  https://pax2graphml.genouest.org/
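
Because .graphml is a plain XML exchange format, a graph exported in it can be read with generic tools. The sketch below uses only the Python standard library to load a tiny hand-written GraphML document into an adjacency dictionary. It does not use the PAX2GRAPHML API, and the `successors` helper and the three-node graph are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the GraphML format.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Tiny hand-written example; real exports carry many more attributes.
GRAPHML = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="directed">
    <node id="EGF"/>
    <node id="EGFR"/>
    <node id="RAS"/>
    <edge source="EGF" target="EGFR"/>
    <edge source="EGFR" target="RAS"/>
  </graph>
</graphml>"""


def successors(graphml_text):
    """Map each node id to the list of node ids it points to."""
    root = ET.fromstring(graphml_text)
    graph = {node.get("id"): [] for node in root.iterfind(".//g:node", NS)}
    for edge in root.iterfind(".//g:edge", NS):
        graph[edge.get("source")].append(edge.get("target"))
    return graph


adjacency = successors(GRAPHML)
print(adjacency)
```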
caspo
- Functional description:
  
  Cell ASP Optimizer (caspo) constitutes a pipeline for automated reasoning on logical signaling networks. The main underlying issue is that, once inherent experimental noise is taken into account, many different logical networks can be compatible with a set of experimental observations.
  Caspo is composed of five modules: 1) The Caspo-learn module performs automated inference of logical networks from experimental data, which identifies admissible large-scale logic models with little effort and without any a priori bias. 2) The Caspo-classify, predict and visualize modules classify a family of Boolean networks with respect to their input-output predictions. 3) The Caspo-design module designs experimental perturbations that would allow optimal discrimination of rival models in a family of Boolean networks. 4) The Caspo-control module identifies key players of a family of networks: it computes robust intervention strategies that force a set of target species or compounds into a desired steady state. 5) The Caspo-timeseries module takes time-series observation datasets into account in the learning procedure.
- Website:
  
  http://bioasp.github.io/caspo/
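
Classifying a family of Boolean networks by their input-output predictions can be sketched as grouping networks that share the same truth table. The example below is an illustration only: networks are modelled as plain Python functions with a single output, and `io_signature`, `classify` and the toy family are hypothetical names, not part of caspo.

```python
from itertools import product


def io_signature(network, n_inputs):
    """Truth table of a Boolean network over every input combination."""
    return tuple(network(*bits)
                 for bits in product((False, True), repeat=n_inputs))


def classify(networks, n_inputs):
    """Group network names that make identical predictions on all inputs."""
    classes = {}
    for name, network in networks.items():
        classes.setdefault(io_signature(network, n_inputs), []).append(name)
    return list(classes.values())


# Toy family: m1 and m2 are written differently but behave identically.
family = {
    "m1": lambda a, b: a and b,
    "m2": lambda a, b: not (not a or not b),
    "m3": lambda a, b: a or b,
}
groups = classify(family, 2)
print(groups)
```

Networks in the same group cannot be told apart by any experiment on these inputs, whereas an input combination on which two groups disagree is a candidate discriminating experiment.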
Logol
- Functional description:
  
  Logol is a pattern-matching grammar language and a set of tools to search for a pattern in a biological (nucleic or amino acid) sequence. It allows sophisticated patterns to be designed (by way of a high-level grammatical formalism) and searched for in large sequences. The LogolMatch tool takes as input a biological sequence (DNA, RNA or protein) and a grammar file. It returns a result file containing the matches with all required details.
  
  Logol is composed of two modules. First, the graphical designer allows a complex pattern to be built iteratively from basic graphical patterns. The associated grammar file is an export of the graphical designer. Second, the LogolMatch parser takes as input a biological sequence and a grammar file. It returns an XML file containing all the occurrences of the pattern in the sequence with their parsing details. The input sequences can be genomes from biological banks.
- Website:
  
  http://logol.genouest.org/
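
A typical pattern that needs more than a regular expression is a stem-loop: a word, a spacer of variable length, then the reverse complement of the same word. The sketch below searches for such a pattern in plain Python to show the kind of constraint a grammar can express. It is not Logol syntax and does not call LogolMatch; `find_stem_loops` and the example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string over the alphabet ACGT."""
    return seq.translate(COMPLEMENT)[::-1]


def find_stem_loops(sequence, stem_len, min_loop, max_loop):
    """List (start, stem, loop) hits where the stem is followed by a loop
    and then by the reverse complement of the stem."""
    hits = []
    for start in range(len(sequence) - 2 * stem_len - min_loop + 1):
        stem = sequence[start:start + stem_len]
        target = reverse_complement(stem)
        for loop_len in range(min_loop, max_loop + 1):
            second = start + stem_len + loop_len
            if sequence[second:second + stem_len] == target:
                hits.append((start, stem, sequence[start + stem_len:second]))
    return hits


hits = find_stem_loops("AAGGCACTTTTGTGCCAA", stem_len=5, min_loop=3, max_loop=5)
print(hits)
```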
Protomata
- Functional description:
  
  This tool is a grammatical inference framework suitable for learning the specific signature of a functional protein family from unaligned sequences by partial and local multiple alignment and automata modelling. It performs a syntactic characterization of proteins by identifying conservation blocks on sequence subsets and modelling their succession. Possible fields of application are the discovery of new members or the study (for instance, for site-directed mutagenesis) of possibly non-homologous functional families and subfamilies, such as enzymatic, signalling or transporting proteins.
  
  Given a sample of sequences belonging to a structural or functional family of proteins, Protomata-Learner infers an automaton characterizing the family by partial local alignment of the sequences. Automata are graphical models representing a (potentially infinite) set of sequences. Able to express alternative local dependencies between the positions, automata offer a finer level of expressivity than classical sequence patterns (such as PSSM, Profile HMM, or Prosite Patterns) and can model more than homologous sequences. They are well suited to get new insights into a family or to search for new family members in the sequence data banks, especially when approaches based on classical multiple sequence alignments are insufficient.
  
  The three main modules integrated in the Protomata-learner workflow are also available as stand-alone programs: 1) paloma builds partial local multiple alignments, 2) protobuild infers automata from these alignments, and 3) protomatch and protoalign scan, parse and align new sequences with learnt automata. The suite is completed by tools to handle or visualize data and can be used online by biologists via a web interface on the Genouest platform.
- Website:
  
  http://tools.genouest.org/tools/protomata/
PowerGrASP
- Functional description:
  
  Implementation of graph compression methods oriented toward visualization, and based on power graph analysis.
- Website:
  
  http://powergrasp.bourneuf.net/
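
The simplest situation exploited by power graph analysis is a set of nodes that all share exactly the same neighbours: such a set can be drawn as a single power node, and its edges to the shared neighbours as a single power edge. The sketch below detects only this case. It is not the PowerGrASP algorithm, and `group_by_neighbourhood` and the toy edge list are hypothetical.

```python
from collections import defaultdict


def group_by_neighbourhood(edges):
    """Group nodes of an undirected graph that share exactly the same neighbours.

    Returns a dict mapping the shared neighbour set to the group of nodes,
    keeping only groups of two or more nodes.
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    groups = defaultdict(set)
    for node, adjacent in neighbours.items():
        groups[frozenset(adjacent)].add(node)
    return {shared: nodes for shared, nodes in groups.items() if len(nodes) > 1}


edges = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y"),
         ("c", "x"), ("c", "y"), ("x", "z")]
power_nodes = group_by_neighbourhood(edges)
print(power_nodes)
```

Here the six edges between `a`, `b`, `c` and `x`, `y` can be replaced by one edge from the group to its shared neighbours, which is what makes the compressed drawing easier to read.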
PPsuite
- Functional description:
  
  This suite contains the following tools:
  – MakePotts infers a Potts model from a sequence or a multiple sequence alignment
  – PPalign aligns Potts models and corresponding sequences
  – VizPotts visualizes inferred Potts models, and VizContacts visualizes inferred couplings with respect to actual contacts in a 3D protein structure.
- Website:
  
  https://www-dyliss.irisa.fr/ppalign/
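
For readers unfamiliar with Potts models, the sketch below shows how a sequence is scored once a model is given: each position contributes a field term and each coupled pair of positions contributes a coupling term. The sign convention (higher score means more probable), the data layout and the `potts_score` function are assumptions for this example, and the parameter values are made up. The code does not read files produced by MakePotts.

```python
def potts_score(sequence, fields, couplings):
    """Unnormalised log-probability of a sequence under a Potts model.

    fields[i][a]              : preference for letter a at position i
    couplings[(i, j)][(a, b)] : preference for letters a, b at positions i < j
    Missing entries count as 0.
    """
    score = sum(fields[i].get(letter, 0.0)
                for i, letter in enumerate(sequence))
    for (i, j), table in couplings.items():
        score += table.get((sequence[i], sequence[j]), 0.0)
    return score


# Made-up parameters for a three-position model.
fields = [{"A": 1.0}, {"C": 0.5}, {"A": 0.25}]
couplings = {(0, 2): {("A", "A"): 2.0}}

print(potts_score("ACA", fields, couplings))
print(potts_score("ACG", fields, couplings))
```

The coupling term is what distinguishes a Potts model from a position-specific profile: the second sequence loses both the field at position 2 and the reward for the co-occurring pair at positions 0 and 2.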
VIRALpro
- Functional description:
  
  VIRALpro is a predictor capable of identifying capsid and tail protein sequences using support vector machines (SVM) with an accuracy estimated to be between 90% and 97%. Predictions are based on the protein amino acid composition, on the protein predicted secondary structure, as predicted by SSpro, and on a boosted linear combination of HMM e-values obtained from 3,380 HMMs built from multiple sequence alignments of specific fragments – called contact fragments – of both capsid and tail sequences.
- Website:
  
  http://scratch.proteomics.ics.uci.edu/
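
The first family of features mentioned above, amino acid composition, is easy to sketch: each protein becomes a vector of 20 frequencies that a classifier such as an SVM can take as input. The code below is an illustration, not VIRALpro code; the `composition` helper, the ordering of the alphabet and the choice to ignore non-standard letters are assumptions.

```python
from collections import Counter

# The 20 standard amino acids, in alphabetical order of their one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Return the frequency of each standard amino acid in the sequence.

    Letters outside the 20 standard codes (for example X) are ignored.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


features = composition("AAGG")
print(features)
```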
Transformer Framework for Protein Characterization – EnzBert
- Functional description:
  
  Given examples of annotated sequences, this tool makes it possible to train models and analyse them with respect to evaluation metrics (accuracy, correlation), plots, and the importance of the residues for the inference. The process is fully automated, and the whole operation can be done by modifying a JSON configuration file and providing a JSON data set. No coding skills are thus required.
  This framework enabled training the EnzBert model which predicts the Enzyme Commission number of protein sequences.
- Website:
  
  https://gitlab.inria.fr/nbuton/tfpc
FUSE-PhyloTree
- Functional description:
  
  FUSE-PhyloTree is dedicated to studying multi-functional protein families, such as paralogous and multi-domain protein families. It enables the association of functions with local sequence conservations through the inference and exploration of their ancestral co-appearance within the evolutionary tree of genes.
- Website:
  
  https://github.com/OcMalde/fuse-phylotree