Getting Started with High-Performance Data Analytics (HPDA)

  • Start date: 25 October 2021
  • Duration: 12 hours
  • Location: Maison du Savoir
    2, avenue de l’Université
    L-4365 Esch-sur-Alzette
  • Language: English
  • Price (excl. VAT): 1360.00

Training context

Underlining Luxembourg’s data-driven innovation strategy, LuxProvide and the Competence Centre collaborate on an exclusive training catalogue related to High Performance Computing and MeluXina, Luxembourg’s brand-new supercomputer.

Bringing Data Analytics to the Next Level with MeluXina Supercomputer

This course gives attendees a working knowledge of some of the core Python libraries used in proof-of-concept work and prototyping, and an understanding of the structure of a data science project. It also provides hands-on experience with the TensorFlow library for machine learning and deep learning, and with statistical visualisation using Seaborn. Finally, attendees become familiar with distributed computing and Big Data concepts and their implementation using Horovod.

Objectives

At the end of this course, the successful attendee will

  • have knowledge about
    • Python notebooks and how the computation is mapped onto hardware infrastructure
    • effective data-science project workflows
    • common data analytics Python libraries and their strengths and weaknesses
    • approaching Big Data problems using distributed computing
  • and be able to
    • work with Python notebooks on MeluXina
    • read in a data set from file or object storage for analysis
    • perform statistical analysis on data in a NumPy array or a pandas DataFrame
    • create visualisations of data using modern libraries
    • define, train and evaluate simple machine learning models with TensorFlow
    • choose a suitable data analytics library for the job to be done
  • in order to
    • independently analyse and visualise data sets of any size on MeluXina

Training programme

This training is divided into 3 modules, each lasting about 4 hours.

TensorFlow and Python – 25th October 2021

  • Core TensorFlow library for Data analytics
    • Python notebooks on MeluXina
    • A primer on vectorisation: the NumPy library
      • Working with arrays
      • A word on efficiency
    • DataFrames: Working with Pandas
      • Series and dataframes
      • Dataframe manipulation
    • Break-down of a data analytics project
      • Data (gathering, pre-processing, exploration, enhancement)
      • Model (training, validation, prediction, visualisation, deployment)

TensorFlow in action – 26th October 2021 (a.m.)

  • Visualisation, Machine Learning, and Deep Learning with TensorFlow
    • Visualising data with Seaborn
      • Statistical plotting with Seaborn
    • Supervised Learning with TensorFlow
      • Description of the problem
      • Regression
      • Classification

Introduction to distributed computing – 26th October 2021 (p.m.)

  • Distributed Computing and Big Data
    • Introduction to distributed computing
      • Distributed computing and training with TensorFlow
    • Machine learning on Big Data: Using Spark (PySpark)
      • Intro to RDDs
      • Technical evaluation
      • Machine Learning

Instructors

From LuxProvide:

  • Dr Alban ROUSSET
  • Dr Farouk MANSOURI
  • Dr Luis VELA
  • Dr Matthieu LEFEBVRE
  • Dr Wahid MAINASSARA

 

Contact

If you have any questions regarding the training, feel free to contact:
Pierre De La Celle
pierre.delacelle@competence.lu / +352 26 15 92 43
