
Initial release

Ali, 3 years ago
parent commit 8189480080

+ 13 - 0
AUTHORS.rst

@@ -0,0 +1,13 @@
+=======
+Credits
+=======
+
+Development Lead
+----------------
+
+* Ali BELLAMINE <contact@alibellamine.me>
+
+Contributors
+------------
+
+None yet. Why not be the first?

+ 128 - 0
CONTRIBUTING.rst

@@ -0,0 +1,128 @@
+.. highlight:: shell
+
+============
+Contributing
+============
+
+Contributions are welcome, and they are greatly appreciated! Every little bit
+helps, and credit will always be given.
+
+You can contribute in many ways:
+
+Types of Contributions
+----------------------
+
+Report Bugs
+~~~~~~~~~~~
+
+Report bugs at https://github.com/audreyr/py_thesis_toolbox/issues.
+
+If you are reporting a bug, please include:
+
+* Your operating system name and version.
+* Any details about your local setup that might be helpful in troubleshooting.
+* Detailed steps to reproduce the bug.
+
+Fix Bugs
+~~~~~~~~
+
+Look through the GitHub issues for bugs. Anything tagged with "bug" and "help
+wanted" is open to whoever wants to implement it.
+
+Implement Features
+~~~~~~~~~~~~~~~~~~
+
+Look through the GitHub issues for features. Anything tagged with "enhancement"
+and "help wanted" is open to whoever wants to implement it.
+
+Write Documentation
+~~~~~~~~~~~~~~~~~~~
+
+py_thesis_toolbox could always use more documentation, whether as part of the
+official py_thesis_toolbox docs, in docstrings, or even on the web in blog posts,
+articles, and such.
+
+Submit Feedback
+~~~~~~~~~~~~~~~
+
+The best way to send feedback is to file an issue at https://github.com/audreyr/py_thesis_toolbox/issues.
+
+If you are proposing a feature:
+
+* Explain in detail how it would work.
+* Keep the scope as narrow as possible, to make it easier to implement.
+* Remember that this is a volunteer-driven project, and that contributions
+  are welcome :)
+
+Get Started!
+------------
+
+Ready to contribute? Here's how to set up `py_thesis_toolbox` for local development.
+
+1. Fork the `py_thesis_toolbox` repo on GitHub.
+2. Clone your fork locally::
+
+    $ git clone git@github.com:your_name_here/py_thesis_toolbox.git
+
+3. Install your local copy into a virtualenv. Assuming you have virtualenvwrapper installed, this is how you set up your fork for local development::
+
+    $ mkvirtualenv py_thesis_toolbox
+    $ cd py_thesis_toolbox/
+    $ python setup.py develop
+
+4. Create a branch for local development::
+
+    $ git checkout -b name-of-your-bugfix-or-feature
+
+   Now you can make your changes locally.
+
+5. When you're done making changes, check that your changes pass flake8 and the
+   tests, including testing other Python versions with tox::
+
+    $ flake8 py_thesis_toolbox tests
+    $ python setup.py test  # or simply: pytest
+    $ tox
+
+   To get flake8 and tox, just pip install them into your virtualenv.
+
+6. Commit your changes and push your branch to GitHub::
+
+    $ git add .
+    $ git commit -m "Your detailed description of your changes."
+    $ git push origin name-of-your-bugfix-or-feature
+
+7. Submit a pull request through the GitHub website.
+
+Pull Request Guidelines
+-----------------------
+
+Before you submit a pull request, check that it meets these guidelines:
+
+1. The pull request should include tests.
+2. If the pull request adds functionality, the docs should be updated. Put
+   your new functionality into a function with a docstring, and add the
+   feature to the list in README.rst.
+3. The pull request should work for Python 3.5, 3.6, 3.7 and 3.8, and for PyPy. Check
+   https://travis-ci.com/audreyr/py_thesis_toolbox/pull_requests
+   and make sure that the tests pass for all supported Python versions.
+
+Tips
+----
+
+To run a subset of tests::
+
+    $ python -m unittest tests.test_py_thesis_toolbox
+
+Deploying
+---------
+
+A reminder for the maintainers on how to deploy.
+Make sure all your changes are committed (including an entry in HISTORY.rst).
+Then run::
+
+    $ bump2version patch  # possible: major / minor / patch
+    $ git push
+    $ git push --tags
+
+Travis will then deploy to PyPI if tests pass.

+ 8 - 0
HISTORY.rst

@@ -0,0 +1,8 @@
+=======
+History
+=======
+
+0.0.1 (2021-03-30)
+------------------
+
+* First release on PyPI.

+ 11 - 0
MANIFEST.in

@@ -0,0 +1,11 @@
+include AUTHORS.rst
+include CONTRIBUTING.rst
+include HISTORY.rst
+include LICENSE
+include README.rst
+
+recursive-include tests *
+recursive-exclude * __pycache__
+recursive-exclude * *.py[co]
+
+recursive-include docs *.rst conf.py Makefile make.bat *.jpg *.png *.gif

+ 85 - 0
Makefile

@@ -0,0 +1,85 @@
+.PHONY: clean clean-test clean-pyc clean-build docs help
+.DEFAULT_GOAL := help
+
+define BROWSER_PYSCRIPT
+import os, webbrowser, sys
+
+from urllib.request import pathname2url
+
+webbrowser.open("file://" + pathname2url(os.path.abspath(sys.argv[1])))
+endef
+export BROWSER_PYSCRIPT
+
+define PRINT_HELP_PYSCRIPT
+import re, sys
+
+for line in sys.stdin:
+	match = re.match(r'^([a-zA-Z_-]+):.*?## (.*)$$', line)
+	if match:
+		target, help = match.groups()
+		print("%-20s %s" % (target, help))
+endef
+export PRINT_HELP_PYSCRIPT
+
+BROWSER := python -c "$$BROWSER_PYSCRIPT"
+
+help:
+	@python -c "$$PRINT_HELP_PYSCRIPT" < $(MAKEFILE_LIST)
+
+clean: clean-build clean-pyc clean-test ## remove all build, test, coverage and Python artifacts
+
+clean-build: ## remove build artifacts
+	rm -fr build/
+	rm -fr dist/
+	rm -fr .eggs/
+	find . -name '*.egg-info' -exec rm -fr {} +
+	find . -name '*.egg' -exec rm -f {} +
+
+clean-pyc: ## remove Python file artifacts
+	find . -name '*.pyc' -exec rm -f {} +
+	find . -name '*.pyo' -exec rm -f {} +
+	find . -name '*~' -exec rm -f {} +
+	find . -name '__pycache__' -exec rm -fr {} +
+
+clean-test: ## remove test and coverage artifacts
+	rm -fr .tox/
+	rm -f .coverage
+	rm -fr htmlcov/
+	rm -fr .pytest_cache
+
+lint: ## check style with flake8
+	flake8 py_thesis_toolbox tests
+
+test: ## run tests quickly with the default Python
+	python setup.py test
+
+test-all: ## run tests on every Python version with tox
+	tox
+
+coverage: ## check code coverage quickly with the default Python
+	coverage run --source py_thesis_toolbox setup.py test
+	coverage report -m
+	coverage html
+	$(BROWSER) htmlcov/index.html
+
+docs: ## generate Sphinx HTML documentation, including API docs
+	rm -f docs/py_thesis_toolbox.rst
+	rm -f docs/modules.rst
+	sphinx-apidoc -o docs/ py_thesis_toolbox
+	$(MAKE) -C docs clean
+	$(MAKE) -C docs html
+	$(BROWSER) docs/_build/html/index.html
+
+servedocs: docs ## compile the docs watching for changes
+	watchmedo shell-command -p '*.rst' -c '$(MAKE) -C docs html' -R -D .
+
+release: dist ## package and upload a release
+	twine upload dist/*
+
+dist: clean ## builds source and wheel package
+	python setup.py sdist
+	python setup.py bdist_wheel
+	ls -l dist
+
+install: clean ## install the package to the active Python's site-packages
+	python setup.py install

+ 82 - 1
README.md

@@ -1,3 +1,84 @@
 # py_thesis_toolbox
 
-Thesis data analysis tools
+Thesis data analysis tools.
+Produces a description of the data followed by a series of univariate tests.
+
+## Installing the package
+
+```
+    git clone https://gogs.alibellamine.me/alibell/py_thesis_toolbox.git
+    cd py_thesis_toolbox
+    pip install -r requirements.txt
+    pip install .
+```
+
+## Usage
+
+### Analyzing a dataset
+
+Analyzing a dataset performs the following steps:
+- Descriptive analysis: mean, number of subjects, median, interquartile range
+- Descriptive explanatory analysis
+- Statistical tests applied during the descriptive explanatory analysis
+
+**@TODO : implement the creation of a multivariate model**
+
+```
+    from thesis_analysis import analyseStatistiques
+
+    analyses = analyseStatistiques(df)
+    analyses.analyse_univarie(
+        variable_interet,
+        variables_explicatives
+    )
+```
+
+#### variable_interet
+
+The **variable of interest** parameter is a dictionary describing a list of qualitative or quantitative variables.
+The dictionary must have the form:
+
+```
+{
+    variable_name: variable_type  # "qualitative" or "quantitative"
+    ...
+}
+```
+
+#### variables_explicatives
+
+List of explanatory variables.
+All of these variables must be qualitative.
+It is a plain list:
+
+```
+[
+    variable_name
+    ...
+]
+```
+
+### Applying a specific test
+
+```
+    from thesis_analysis.test import testQualitatif, testQuantitatif
+
+    test = testQualitatif(df, y, x)
+    test.best_test()
+```
+
+For a given x and y pair, applies the best possible test.
+Parameters:
+- df : DataFrame containing the full dataset
+- y : variable of interest, on which the impact of x is measured
+- x : variable whose impact on y is measured
+
+The **best_test** method determines the best test applicable to the data.
+
+It is also possible to run individual tests manually.
+The list of available tests can be obtained by running:
+
+```
+    dir(test)
+```
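To make the input formats above concrete, here is a minimal, self-contained sketch; the column names (`age`, `sexe`, `groupe`) are purely illustrative, and the final call is shown as a comment since it assumes the package is installed:

```python
import pandas as pd

# Hypothetical dataset (column names are illustrative only)
df = pd.DataFrame({
    "age": [34, 51, 42, 29, 60, 45],
    "sexe": ["F", "M", "F", "F", "M", "M"],
    "groupe": ["A", "A", "B", "B", "A", "B"],
})

# Dictionary of variables of interest: name -> "qualitative" or "quantitative"
variable_interet = {
    "age": "quantitative",
    "sexe": "qualitative",
}

# List of explanatory variables (all must be qualitative)
variables_explicatives = ["groupe"]

# With the package installed, the analysis would then be:
# from thesis_analysis import analyseStatistiques
# analyses = analyseStatistiques(df)
# resultats = analyses.analyse_univarie(variable_interet, variables_explicatives)
```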

+ 25 - 0
README.rst

@@ -0,0 +1,25 @@
+=================
+py_thesis_toolbox
+=================
+
+Python Boilerplate contains all the boilerplate you need to create a Python package.
+
+
+
+Features
+--------
+
+* TODO
+
+Credits
+-------
+
+This package was created with Cookiecutter_ and the `audreyr/cookiecutter-pypackage`_ project template.
+
+.. _Cookiecutter: https://github.com/audreyr/cookiecutter
+.. _`audreyr/cookiecutter-pypackage`: https://github.com/audreyr/cookiecutter-pypackage

+ 20 - 0
docs/Makefile

@@ -0,0 +1,20 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line.
+SPHINXOPTS    =
+SPHINXBUILD   = python -msphinx
+SPHINXPROJ    = py_thesis_toolbox
+SOURCEDIR     = .
+BUILDDIR      = _build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

+ 1 - 0
docs/authors.rst

@@ -0,0 +1 @@
+.. include:: ../AUTHORS.rst

+ 162 - 0
docs/conf.py

@@ -0,0 +1,162 @@
+#!/usr/bin/env python
+#
+# py_thesis_toolbox documentation build configuration file, created by
+# sphinx-quickstart on Fri Jun  9 13:47:02 2017.
+#
+# This file is execfile()d with the current directory set to its
+# containing dir.
+#
+# Note that not all possible configuration values are present in this
+# autogenerated file.
+#
+# All configuration values have a default; values that are commented out
+# serve to show the default.
+
+# If extensions (or modules to document with autodoc) are in another
+# directory, add these directories to sys.path here. If the directory is
+# relative to the documentation root, use os.path.abspath to make it
+# absolute, like shown here.
+#
+import os
+import sys
+sys.path.insert(0, os.path.abspath('..'))
+
+import py_thesis_toolbox
+
+# -- General configuration ---------------------------------------------
+
+# If your documentation needs a minimal Sphinx version, state it here.
+#
+# needs_sphinx = '1.0'
+
+# Add any Sphinx extension module names here, as strings. They can be
+# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
+extensions = ['sphinx.ext.autodoc', 'sphinx.ext.viewcode']
+
+# Add any paths that contain templates here, relative to this directory.
+templates_path = ['_templates']
+
+# The suffix(es) of source filenames.
+# You can specify multiple suffix as a list of string:
+#
+# source_suffix = ['.rst', '.md']
+source_suffix = '.rst'
+
+# The master toctree document.
+master_doc = 'index'
+
+# General information about the project.
+project = 'py_thesis_toolbox'
+copyright = "2021, Ali BELLAMINE"
+author = "Ali BELLAMINE"
+
+# The version info for the project you're documenting, acts as replacement
+# for |version| and |release|, also used in various other places throughout
+# the built documents.
+#
+# The short X.Y version.
+version = py_thesis_toolbox.__version__
+# The full version, including alpha/beta/rc tags.
+release = py_thesis_toolbox.__version__
+
+# The language for content autogenerated by Sphinx. Refer to documentation
+# for a list of supported languages.
+#
+# This is also used if you do content translation via gettext catalogs.
+# Usually you set "language" from the command line for these cases.
+language = None
+
+# List of patterns, relative to source directory, that match files and
+# directories to ignore when looking for source files.
+# This patterns also effect to html_static_path and html_extra_path
+exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
+
+# The name of the Pygments (syntax highlighting) style to use.
+pygments_style = 'sphinx'
+
+# If true, `todo` and `todoList` produce output, else they produce nothing.
+todo_include_todos = False
+
+
+# -- Options for HTML output -------------------------------------------
+
+# The theme to use for HTML and HTML Help pages.  See the documentation for
+# a list of builtin themes.
+#
+html_theme = 'alabaster'
+
+# Theme options are theme-specific and customize the look and feel of a
+# theme further.  For a list of options available for each theme, see the
+# documentation.
+#
+# html_theme_options = {}
+
+# Add any paths that contain custom static files (such as style sheets) here,
+# relative to this directory. They are copied after the builtin static files,
+# so a file named "default.css" will overwrite the builtin "default.css".
+html_static_path = ['_static']
+
+
+# -- Options for HTMLHelp output ---------------------------------------
+
+# Output file base name for HTML help builder.
+htmlhelp_basename = 'py_thesis_toolboxdoc'
+
+
+# -- Options for LaTeX output ------------------------------------------
+
+latex_elements = {
+    # The paper size ('letterpaper' or 'a4paper').
+    #
+    # 'papersize': 'letterpaper',
+
+    # The font size ('10pt', '11pt' or '12pt').
+    #
+    # 'pointsize': '10pt',
+
+    # Additional stuff for the LaTeX preamble.
+    #
+    # 'preamble': '',
+
+    # Latex figure (float) alignment
+    #
+    # 'figure_align': 'htbp',
+}
+
+# Grouping the document tree into LaTeX files. List of tuples
+# (source start file, target name, title, author, documentclass
+# [howto, manual, or own class]).
+latex_documents = [
+    (master_doc, 'py_thesis_toolbox.tex',
+     'py_thesis_toolbox Documentation',
+     'Ali BELLAMINE', 'manual'),
+]
+
+
+# -- Options for manual page output ------------------------------------
+
+# One entry per manual page. List of tuples
+# (source start file, name, description, authors, manual section).
+man_pages = [
+    (master_doc, 'py_thesis_toolbox',
+     'py_thesis_toolbox Documentation',
+     [author], 1)
+]
+
+
+# -- Options for Texinfo output ----------------------------------------
+
+# Grouping the document tree into Texinfo files. List of tuples
+# (source start file, target name, title, author,
+#  dir menu entry, description, category)
+texinfo_documents = [
+    (master_doc, 'py_thesis_toolbox',
+     'py_thesis_toolbox Documentation',
+     author,
+     'py_thesis_toolbox',
+     'One line description of project.',
+     'Miscellaneous'),
+]
+
+
+

+ 1 - 0
docs/contributing.rst

@@ -0,0 +1 @@
+.. include:: ../CONTRIBUTING.rst

+ 1 - 0
docs/history.rst

@@ -0,0 +1 @@
+.. include:: ../HISTORY.rst

+ 20 - 0
docs/index.rst

@@ -0,0 +1,20 @@
+Welcome to py_thesis_toolbox's documentation!
+=============================================
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Contents:
+
+   readme
+   installation
+   usage
+   modules
+   contributing
+   authors
+   history
+
+Indices and tables
+==================
+
+* :ref:`genindex`
+* :ref:`modindex`
+* :ref:`search`

+ 51 - 0
docs/installation.rst

@@ -0,0 +1,51 @@
+.. highlight:: shell
+
+============
+Installation
+============
+
+
+Stable release
+--------------
+
+To install py_thesis_toolbox, run this command in your terminal:
+
+.. code-block:: console
+
+    $ pip install py_thesis_toolbox
+
+This is the preferred method to install py_thesis_toolbox, as it will always install the most recent stable release.
+
+If you don't have `pip`_ installed, this `Python installation guide`_ can guide
+you through the process.
+
+.. _pip: https://pip.pypa.io
+.. _Python installation guide: http://docs.python-guide.org/en/latest/starting/installation/
+
+
+From sources
+------------
+
+The sources for py_thesis_toolbox can be downloaded from the `Github repo`_.
+
+You can either clone the public repository:
+
+.. code-block:: console
+
+    $ git clone git://github.com/audreyr/py_thesis_toolbox
+
+Or download the `tarball`_:
+
+.. code-block:: console
+
+    $ curl -OJL https://github.com/audreyr/py_thesis_toolbox/tarball/master
+
+Once you have a copy of the source, you can install it with:
+
+.. code-block:: console
+
+    $ python setup.py install
+
+
+.. _Github repo: https://github.com/audreyr/py_thesis_toolbox
+.. _tarball: https://github.com/audreyr/py_thesis_toolbox/tarball/master

+ 36 - 0
docs/make.bat

@@ -0,0 +1,36 @@
+@ECHO OFF
+
+pushd %~dp0
+
+REM Command file for Sphinx documentation
+
+if "%SPHINXBUILD%" == "" (
+	set SPHINXBUILD=python -msphinx
+)
+set SOURCEDIR=.
+set BUILDDIR=_build
+set SPHINXPROJ=py_thesis_toolbox
+
+if "%1" == "" goto help
+
+%SPHINXBUILD% >NUL 2>NUL
+if errorlevel 9009 (
+	echo.
+	echo.The Sphinx module was not found. Make sure you have Sphinx installed,
+	echo.then set the SPHINXBUILD environment variable to point to the full
+	echo.path of the 'sphinx-build' executable. Alternatively you may add the
+	echo.Sphinx directory to PATH.
+	echo.
+	echo.If you don't have Sphinx installed, grab it from
+	echo.http://sphinx-doc.org/
+	exit /b 1
+)
+
+%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS%
+goto end
+
+:help
+%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS%
+
+:end
+popd

+ 1 - 0
docs/readme.rst

@@ -0,0 +1 @@
+.. include:: ../README.rst

+ 7 - 0
docs/usage.rst

@@ -0,0 +1,7 @@
+=====
+Usage
+=====
+
+To use py_thesis_toolbox in a project::
+
+    import py_thesis_toolbox

+ 4 - 0
requirement.txt

@@ -0,0 +1,4 @@
+statsmodels
+scipy
+pandas
+numpy

+ 10 - 0
requirements_dev.txt

@@ -0,0 +1,10 @@
+pip==19.2.3
+bump2version==0.5.11
+wheel==0.33.6
+watchdog==0.9.0
+flake8==3.7.8
+tox==3.14.0
+coverage==4.5.4
+Sphinx==1.8.5
+twine==1.14.0
+

+ 22 - 0
setup.cfg

@@ -0,0 +1,22 @@
+[bumpversion]
+current_version = 0.0.1
+commit = True
+tag = True
+
+[bumpversion:file:setup.py]
+search = version='{current_version}'
+replace = version='{new_version}'
+
+[bumpversion:file:py_thesis_toolbox/__init__.py]
+search = __version__ = '{current_version}'
+replace = __version__ = '{new_version}'
+
+[bdist_wheel]
+universal = 1
+
+[flake8]
+exclude = docs
+
+[aliases]
+# Define setup.py command aliases here
+

+ 46 - 0
setup.py

@@ -0,0 +1,46 @@
+#!/usr/bin/env python
+
+"""The setup script."""
+
+from setuptools import setup, find_packages
+
+with open('README.rst') as readme_file:
+    readme = readme_file.read()
+
+with open('HISTORY.rst') as history_file:
+    history = history_file.read()
+
+requirements = [ ]
+
+setup_requirements = [ ]
+
+test_requirements = [ ]
+
+setup(
+    author="Ali BELLAMINE",
+    author_email='contact@alibellamine.me',
+    python_requires='>=3.5',
+    classifiers=[
+        'Development Status :: 2 - Pre-Alpha',
+        'Intended Audience :: Developers',
+        'Natural Language :: English',
+        'Programming Language :: Python :: 3',
+        'Programming Language :: Python :: 3.5',
+        'Programming Language :: Python :: 3.6',
+        'Programming Language :: Python :: 3.7',
+        'Programming Language :: Python :: 3.8',
+    ],
+    description="Python Boilerplate contains all the boilerplate you need to create a Python package.",
+    install_requires=requirements,
+    long_description=readme + '\n\n' + history,
+    include_package_data=True,
+    keywords='py_thesis_toolbox',
+    name='py_thesis_toolbox',
+    packages=find_packages(include=['py_thesis_toolbox', 'py_thesis_toolbox.*']),
+    setup_requires=setup_requirements,
+    test_suite='tests',
+    tests_require=test_requirements,
+    url='https://github.com/audreyr/py_thesis_toolbox',
+    version='0.0.1',
+    zip_safe=False,
+)

+ 1 - 0
tests/__init__.py

@@ -0,0 +1 @@
+"""Unit test package for py_thesis_toolbox."""

+ 21 - 0
tests/test_py_thesis_toolbox.py

@@ -0,0 +1,21 @@
+#!/usr/bin/env python
+
+"""Tests for `py_thesis_toolbox` package."""
+
+
+import unittest
+
+from thesis_analysis import analyseStatistiques
+
+
+class TestPy_thesis_toolbox(unittest.TestCase):
+    """Tests for `py_thesis_toolbox` package."""
+
+    def setUp(self):
+        """Set up test fixtures, if any."""
+
+    def tearDown(self):
+        """Tear down test fixtures, if any."""
+
+    def test_000_something(self):
+        """Test something."""

+ 1 - 0
thesis_analysis/__init__.py

@@ -0,0 +1 @@
+from .analyseStatistiques import analyseStatistiques

+ 169 - 0
thesis_analysis/analyseStatistiques.py

@@ -0,0 +1,169 @@
+import math
+
+import pandas as pd
+
+from .test import testQualitatif, testQuantitatif
+
+class analyseStatistiques ():
+    """
+        Analyzes a dataset.
+        Univariate analyses:
+            Description of the data
+            Application of statistical tests
+
+        Input : dataset
+    """
+    
+    def __init__ (self, df):
+        # Load the dataframe
+        self.df = df
+        
+    def _describe_qualitative (self, data):
+        
+        """
+            Compute n and p for each modality of the qualitative variable
+            Input : data, Pandas Series containing data to describe
+        """
+                
+        table = data.value_counts()
+        description = pd.DataFrame({'n':table, 'p':table/table.sum()}) \
+            .to_dict("index")
+        description["total"] = table.sum()
+        
+        return(description)
+    
+    def _get_sub_table(self, variable, axes):
+        
+        # Select the data to analyze
+        if (axes is None):
+            temp_data = self.df[[variable]]
+        else:
+            temp_data = self.df[[variable]+axes]
+        
+        temp_data = temp_data.dropna()
+        
+        return(temp_data)
+        
+        
+    def _analyse_univarie_qualitative (self, variable, axes = None):
+        
+        temp_data = self._get_sub_table(variable, [])
+        
+        # Initialize an empty result dictionary
+        analyse = {}
+        
+        analyse["n"] = temp_data.shape[0]
+
+        ## Global: outside any analysis axis
+        analyse["global"] = self._describe_qualitative(temp_data[variable])
+
+        ## Specific: within each analysis axis
+        if (axes is not None):
+                
+            analyse["sous_groupes"] = {}
+            analyse["test"] = {}
+            
+            for axe in axes:  
+                
+                temp_data = self._get_sub_table(variable, [axe])
+                
+                # Axe values
+                axe_values = temp_data[axe] \
+                    .drop_duplicates().values.tolist()
+                
+                # Description
+                analyse["sous_groupes"][axe] = {}
+                for values in axe_values:
+                    analyse["sous_groupes"][axe][values] = self._describe_qualitative(
+                        temp_data[
+                            temp_data[axe] == values
+                        ] \
+                        .reset_index(drop = True)[variable]
+                    )
+
+                # Statistical test
+                analyse["test"][axe] = testQualitatif(temp_data, variable,axe).best_test()
+
+        analyse["type"] = "qualitative"
+
+        return analyse
+    
+    def _describe_quantitative (self, data):
+        """
+            Calculate mean, median, Q25, Q75, std, std_mean and 95% CI for quantitative data
+            Input : data, Pandas Series containing data to describe
+        """
+        
+        # Dict containing data
+        description = {}
+        
+        description["n"] = data.shape[0]
+        description["mean"] = data.mean()
+        description["median"] = data.median()
+        description["Q25"] = data.quantile(0.25)
+        description["Q75"] = data.quantile(0.75)
+        description["std"] = data.std()
+        description["std_mean"] = description["std"]/math.sqrt(description["n"])
+        description["ci_95"] = [description["mean"]-1.96*description["std_mean"], 
+                                description["mean"]+1.96*description["std_mean"]]        
+        
+        return description
+    
+    def _analyse_univarie_quantitative (self, variable, axes = None):
+        
+        # Select the data to analyze
+        temp_data = self._get_sub_table(variable, [])
+        
+        # Initialize an empty result dictionary
+        analyse = {}
+        
+        analyse["n"] = temp_data.shape[0]
+
+        ## Global: outside any analysis axis
+        analyse["global"] = self._describe_quantitative(temp_data[variable])
+                
+        ## Specific: within each analysis axis
+        if (axes is not None):
+                
+            analyse["sous_groupes"] = {}
+            analyse["test"] = {}
+
+            for axe in axes:  
+                
+                temp_data = self._get_sub_table(variable, [axe])
+                
+                # Axe values
+                axe_values = temp_data[axe] \
+                    .drop_duplicates().values.tolist()
+                
+                # Description
+                analyse["sous_groupes"][axe] = {}
+                for values in axe_values:
+                    analyse["sous_groupes"][axe][values] = self._describe_quantitative(
+                        temp_data[
+                            temp_data[axe] == values
+                        ] \
+                        .reset_index(drop = True)[variable]
+                    )
+
+                # Statistical test
+                analyse["test"][axe] = testQuantitatif(temp_data, variable,axe).best_test()
+
+        analyse["type"] = "quantitative"
+
+        return analyse
+        
+    def analyse_univarie (self, variables, axes = None):
+        """
+            Univariate descriptive analysis
+                variables : dictionary of the variables to analyze, of the form:
+                    key = variable name, value = variable type ("quantitative" or "qualitative")
+                axes : analysis axes for the variables; must be qualitative. List of variable names.
+        """
+        
+        # Collected results
+        resultats = {}
+        
+        for variable, type_variable in variables.items():
+            if type_variable == 'qualitative':
+                resultats[variable] = self._analyse_univarie_qualitative(variable, axes)
+            elif type_variable == 'quantitative':
+                resultats[variable] = self._analyse_univarie_quantitative(variable, axes)                
+                
+        return(resultats)
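The formulas used by `_describe_quantitative` (sample standard deviation, standard error of the mean, 95% confidence interval) can be reproduced in isolation; this standalone sketch uses a small made-up sample:

```python
import math
import pandas as pd

# Small hypothetical sample
data = pd.Series([4.0, 8.0, 6.0, 5.0, 7.0])

n = data.shape[0]
mean = data.mean()              # arithmetic mean
median = data.median()
std = data.std()                # sample standard deviation (ddof=1)
std_mean = std / math.sqrt(n)   # standard error of the mean
# 95% confidence interval of the mean, as computed in _describe_quantitative
ci_95 = [mean - 1.96 * std_mean, mean + 1.96 * std_mean]

print(n, mean, median)  # 5 6.0 6.0
```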

+ 2 - 0
thesis_analysis/test/__init__.py

@@ -0,0 +1,2 @@
+from .test_qualitatif import testQualitatif
+from .test_quantitatif import testQuantitatif

+ 111 - 0
thesis_analysis/test/test_qualitatif.py

@@ -0,0 +1,111 @@
+import pandas as pd
+import numpy as np
+from scipy.stats import chi2_contingency, fisher_exact
+from scipy.stats import normaltest, kstest, levene, ttest_ind, mannwhitneyu, f_oneway, kruskal
+import statsmodels.stats.weightstats as ws
+
+# Qualitatif
+
+class testQualitatif ():
+    """
+        Applies the most appropriate qualitative test to a dataset
+            df : dataset
+            y : variable to test
+            x : variable whose impact is measured
+    """
+    
+    def __init__ (self, df, y, x):
+        
+        # Compute the contingency table
+        self.contingency = self._get_contingency(df, y, x)
+        
+    def _get_contingency (self, df, y, x):
+        
+        contingency = pd.DataFrame(
+            {"n":df.groupby(x)[y].value_counts()}
+        ).reset_index() \
+        .pivot_table(index = [x], columns = [y]) \
+        .fillna(0)
+        contingency.columns = contingency.columns.droplevel(0)
+        
+        return(contingency.values.astype(int))
+        
+    def best_test (self):
+        
+        """
+            Selects the best applicable test
+        """
+        
+        # Priority order
+        ## 1. Chi-squared
+        ## 2. Chi-squared with Yates correction
+        ## 3. Fisher's exact test
+        ## 4. No test
+        
+        order_test = {
+            "khi2":[self.khi2,[False]],
+            "khi2_yates":[self.khi2,[True]],
+            "fisher":[self.fisher,[]],
+            "no_test":[self._no_test, []]
+        }
+        
+        ## Apply the tests in order
+        for test_name, test in order_test.items():
+            test_result = test[0](*test[1])
+
+            if (test_result["valid"] == True):
+                return (test_name, test_result)
+        
+    def khi2 (self, yates_correction = False):
+        """
+            Chi-squared test
+            Parameter :
+                yates_correction : if True, applies the Yates correction
+        """
+        
+        # Run the test
+        test_result = chi2_contingency(self.contingency, correction=yates_correction)
+        
+        # Check test validity
+        if yates_correction == False:
+            khi2_valid = len([True for x in test_result[3] for y in x if y < 5]) == 0
+        else:
+            khi2_valid = (len([True for x in test_result[3] for y in x if y < 3]) == 0) \
+                        & (test_result[2] == 1)
+        
+        # Structure the result
+        output_result = dict(zip(
+            ["statistic","p_value", "dof", "theorical_values","observed_values","yates_correction", "valid"],
+            list(test_result)+[self.contingency, yates_correction, khi2_valid]
+        ))
+        
+        return output_result
+    
+    def fisher (self):
+        """
+            Fisher's exact test
+        """
+        
+        # Check validity conditions (a 2x2 table is required)
+        valid = (self.contingency.shape == (2,2))
+        
+        if (valid == True):
+            fisher_result = fisher_exact(self.contingency)
+            
+            output_result = dict(zip(
+                ["statistic", "p_value", "observed_values","valid"],
+                list(fisher_result)+[self.contingency, valid]
+            ))
+        else:
+            output_result = {"valid":valid}
+            
+        return output_result
+    
+    def _no_test (self):
+        """
+            Fallback when no test is applicable
+        """
+        
+        output_result = {"valid":True}
+        
+        return output_result
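The chi-squared path of `best_test` can be sketched independently of the class; `pd.crosstab` yields the same table of counts as `_get_contingency`, and the validity rule without the Yates correction is that every expected count is at least 5. The data below are made up for illustration:

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical dataset with two qualitative variables
df = pd.DataFrame({
    "x": ["a", "a", "a", "b", "b", "b", "a", "b"] * 5,
    "y": ["+", "-", "+", "-", "+", "-", "-", "+"] * 5,
})

# Contingency table of raw counts (equivalent to _get_contingency)
contingency = pd.crosstab(df["x"], df["y"]).values

stat, p_value, dof, expected = chi2_contingency(contingency, correction=False)

# Validity rule used by testQualitatif.khi2: every expected count >= 5
khi2_valid = bool((expected >= 5).all())
```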

+ 193 - 0
thesis_analysis/test/test_quantitatif.py

@@ -0,0 +1,193 @@
+import pandas as pd
+import numpy as np
+from scipy.stats import chi2_contingency, fisher_exact
+from scipy.stats import normaltest, kstest, levene, ttest_ind, mannwhitneyu, f_oneway, kruskal
+import statsmodels.stats.weightstats as ws
+
+class testQuantitatif ():
+    """
+        Applies the most appropriate quantitative test to a dataset
+            df : dataframe containing the data
+            y : variable to test
+            x : variable whose impact is measured
+    """
+    
+    def __init__ (self, df, y, x):
+        
+        # Store the dataset and variable names
+        self.df = df
+        self.x = x
+        self.y = y
+        
+        # Determine the modalities of x
+        self.x_shapes = self.df[x].drop_duplicates()
+        
+        # Determine the y values for each modality of x
+        self.y_values = {}
+        for x_value in self.x_shapes:
+            self.y_values[x_value] = df[
+                            df[x] == x_value
+                        ][y] \
+                        .values
+            
+        # Compute the validity criteria
+        self.x_shape = self.x_shapes.shape[0]
+        self.n_sup_30 = (df.groupby(x).count()[y] >= 30).all()
+        
+    def _check_all_group_normal(self):
+        
+        # Apply the normality test
+        normal_test_result = self.normal_distribution()
+        
+        # Check that every group is normally distributed
+        valid = all(x["p_value"] >= 0.05 for x in normal_test_result.values())
+        
+        return valid
+        
+    def best_test (self, normal = None):
+        
+        """
+            Selects the most appropriate test
+            
+            Parameter :
+                normal : boolean; if True, a normal distribution is assumed without running a normality test
+        """
+        
+        # Decision tree
+        ## x_shape :
+        ### 2 groups :
+        #### N1, N2 >= 30 : Z-test
+        #### N1 or N2 < 30 :
+        ##### Is each group normally distributed?
+        ###### Equal variances : t-test
+        ###### Unequal variances : Welch's test
+        ##### Non-normal distribution : Mann-Whitney-Wilcoxon
+        ### > 2 groups :
+        #### Equal variances and normal distribution : one-way ANOVA
+        #### Otherwise : Kruskal-Wallis test
+        
+        if self.x_shape == 2:
+            if self.n_sup_30:
+                # Apply the Z-test
+                result = self.z_test()
+                test_applied = "z_test"
+            else:
+                # Check normality
+                if normal or self._check_all_group_normal():
+                    # Check the equality of variances
+                    if self.variance_equity()["p_value"] >= 0.05:
+                        # Equal variances assumed : t-test
+                        welch = False
+                        test_applied = "t_test"
+                    else:
+                        # Unequal variances : Welch's test
+                        welch = True
+                        test_applied = "t_test Welch"
+                        
+                    result = self.t_test(welch)
+                else:
+                    test_applied = "Mann-Whitney Wilcoxon"
+                    result = self.mwwilcoxon()
+        else:
+            # Check normality and the equality of variances
+            if (normal or self._check_all_group_normal()) and (self.variance_equity()["p_value"] >= 0.05):
+                # One-way ANOVA
+                test_applied = "ANOVA_1W"
+                result = self.anova_1w()
+            else:
+                test_applied = "Kruskal_Wallis"
+                result = self.kruskal_wallis()
+                
+        return (test_applied, result)
+                
+    def _apply_test (self, test_function, params = None):
+        
+        # Apply the given test function to every group of y values
+        test_result = test_function(*list(self.y_values.values()),
+                                **(params or {}))
+        
+        output_result = dict(zip(
+            ["statistic","p_value"],
+            [test_result.statistic, test_result.pvalue]
+        ))
+        
+        return output_result
+        
+    def normal_distribution (self):
+        
+        # Apply the normality test to each group; the values are
+        # standardized first because kstest compares against the
+        # standard normal distribution N(0, 1)
+        norm_test_result = dict(zip(
+            self.y_values.keys(),
+            [   dict(zip(
+                    ["statistic", "p_value"],
+                    [x.statistic, x.pvalue]
+                ))
+                for x in
+                [kstest((y_values - y_values.mean()) / y_values.std(), "norm")
+                 for y_values in self.y_values.values()]
+            ]
+        ))
+            
+        return norm_test_result
+    
+    def z_test (self):
+        
+        # Apply the test
+        test_result = ws.ztest(*list(self.y_values.values()))
+        
+        output_result = dict(zip(
+            ["statistic","p_value"],
+            list(test_result)
+        ))
+        
+        return output_result
+
+    def t_test (self, welch = False):
+        
+        # Apply the test; Welch's variant does not assume equal variances
+        output_result = self._apply_test(ttest_ind, {"equal_var": not welch})
+        
+        return output_result
+    
+    def mwwilcoxon (self):
+        
+        # Apply the test
+        output_result = self._apply_test(mannwhitneyu)
+        
+        return output_result
+    
+    def variance_equity (self):
+        
+        """
+            Levene's test, centered on the mean
+        """
+        
+        # Apply the test
+        output_result = self._apply_test(levene, {"center":"mean"})
+        
+        return output_result
+    
+    def anova_1w (self):
+        
+        """
+            One-way ANOVA : compares the means of all the groups
+        """
+        
+        # Apply the test
+        output_result = self._apply_test(f_oneway)
+        
+        return output_result
+        
+    def kruskal_wallis (self):
+        # Apply the test
+        output_result = self._apply_test(kruskal)
+        
+        return output_result
+    
+    def _no_test (self):
+        """
+            Indicates that no test was applied
+        """
+        
+        output_result = {"valid":True}
+        
+        return output_result

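The two-group branch of `testQuantitatif.best_test` above chains Levene's test, the t-test/Welch's test, and Mann-Whitney-Wilcoxon. A minimal standalone sketch of that branch, written directly against `scipy.stats` (the variable names `a`/`b` and the synthetic data are illustrative only, not part of the commit):

```python
import numpy as np
from scipy.stats import levene, ttest_ind, mannwhitneyu

# Two small synthetic groups (n < 30, so the Z-test branch does not apply)
rng = np.random.default_rng(42)
a = rng.normal(loc=0.0, scale=1.0, size=20)
b = rng.normal(loc=0.5, scale=1.0, size=20)

# Levene's test centered on the mean decides between Student and Welch
equal_var = levene(a, b, center="mean").pvalue >= 0.05

# Student's t-test when variances look equal, Welch's test otherwise
t_res = ttest_ind(a, b, equal_var=bool(equal_var))

# Fallback used when the groups are not normally distributed
u_res = mannwhitneyu(a, b)

print({"t_p": round(t_res.pvalue, 3), "u_p": round(u_res.pvalue, 3)})
```

Note that `scipy.stats.ttest_ind(..., equal_var=False)` is Welch's test, which is why the class passes `not welch` to `equal_var`.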
+ 20 - 0
tox.ini

@@ -0,0 +1,20 @@
+[tox]
+envlist = py35, py36, py37, py38, flake8
+
+[travis]
+python =
+    3.8: py38
+    3.7: py37
+    3.6: py36
+    3.5: py35
+
+[testenv:flake8]
+basepython = python
+deps = flake8
+commands = flake8 py_thesis_toolbox tests
+
+[testenv]
+setenv =
+    PYTHONPATH = {toxinidir}
+
+commands = python setup.py test