We continue telling you about the projects of the EPAM training center laboratory. This time, we interviewed Mariia Nesvit, Senior Software Engineer and Technical Lead of the CROMWELL project in the laboratory of the EPAM training center. We also talked with students of the laboratory, who shared their own view of participating in the project and the details of learning Scala.
What is the training center? What does its laboratory do?
The EPAM Training Center is a division of the company responsible for training novice IT specialists. Most junior specialists come to the company through training at the training center.
To get started, you need not only technical knowledge but also experience with the tools used for development and an understanding of the basic processes of project work. First, students attend trainings, where they get acquainted with the theory, and then go to the laboratory, where they gain practical experience working on educational projects.
We have already described training in the laboratory of the training center in more detail in one of our previous articles. Today we invite you to get acquainted with the CROMWELL student project.
About the CROMWELL project
The educational project is based on the open-source CROMWELL project, which helps scientists process specific scientific data. It was developed by a team from the Broad Institute, an American research institute working in bioinformatics. The Institute conducts research in different areas: oncology, infectious diseases, the study of metabolism, epigenomics, population genetics, and others.
When the data is voluminous (for example, when scientists study genome files), the capacity of a local machine is not enough, and the processing must be run in the cloud. There are special languages, WDL (Workflow Description Language) and CWL (Common Workflow Language), in which a researcher can describe what needs to be done with the data. The file with this script is then loaded into the CROMWELL application and processed in the cloud. CROMWELL can work with different clouds: AWS, Google Cloud, Alibaba Cloud. As a result, specialists are not bound to any one cloud provider, and they do not need specific programming knowledge. They only need to master a DSL (Domain Specific Language), i.e. a simple language created specifically for their scientific field, and understand its syntax; the application then launches the necessary calculations and provides the result.
Problem
CROMWELL is an open-source product, and its code is available on GitHub. The problem the original tool solves is simplifying the processing of scientific data on different platforms.
At the very beginning, a team of students and trainers from the EPAM training laboratory worked directly with the CROMWELL application and interacted with the development team from the Broad Institute. Then, based on the existing project, we decided to make a new project that would be more understandable for students starting to learn Scala.
As a result, the training center team developed an application that supplies data to the CROMWELL system but has a more user-friendly interface. It is used from the command line and lets you store statistics as well as the input and output data submitted to the application. Accordingly, it can be useful to people who use the main CROMWELL application.
Now, the training is structured in such a way that laboratory students first perform tasks on the internal project called the CROMWELL pipeline, and then, having mastered the libraries and writing Scala code, they move on to the open-source projects.
Now, in the laboratory, you can work with several libraries and projects actively used in the Scala community, for example, Cats, ZIO, ZIO-config, and the Dotty project.
Why is the project interesting?
For the laboratory, the CROMWELL project is not quite ordinary since it is written in Scala, a JVM programming language that supports both object-oriented and functional paradigms.
Back-end developers mainly work with Java, so students who come to the evening trainings receive theory and their first assignments in this language. At the laboratory, however, they can either continue learning Java on a training project or come to CROMWELL and learn Scala.
Andrey (laboratory graduate):
When our training began, we had to learn to use the language in two weeks. Functional programming is very different from the object-oriented programming we were used to in Java. In Scala, everything is an object. I like Scala because everything is much shorter. In Scala, you can make the code much more “beautiful.”
Grigory (laboratory graduate):
Scala is very different from Java, although it also runs on the JVM. Now, it seems to me that writing in Scala is much more convenient. When I get back to Java, I notice that my way of thinking about writing code has changed a lot. Sometimes, when solving problems, I catch myself thinking that in Scala it would be possible to shorten the code even more, but in Java I cannot do this.
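To illustrate what the students describe, here is a minimal sketch of how Scala can combine the object-oriented and functional styles in a few lines. It is not code from the CROMWELL pipeline: the names Sample, Filter, MinReads, and totalReads are hypothetical and chosen only for this example.

```scala
// Hypothetical domain type for illustration; not taken from the project.
final case class Sample(id: String, readCount: Long)

object SampleStats {
  // Object-oriented side: a trait with a concrete implementation.
  trait Filter {
    def keep(sample: Sample): Boolean
  }

  final class MinReads(threshold: Long) extends Filter {
    def keep(sample: Sample): Boolean = sample.readCount >= threshold
  }

  // Functional side: an immutable list and higher-order functions, no loops.
  def totalReads(samples: List[Sample], filter: Filter): Long =
    samples.filter(filter.keep).map(_.readCount).sum

  def main(args: Array[String]): Unit = {
    val samples = List(Sample("a", 120L), Sample("b", 40L), Sample("c", 300L))
    // Samples "a" and "c" pass the threshold, so this prints 420.
    println(totalReads(samples, new MinReads(100L)))
  }
}
```

The case class replaces the constructor, getters, equals, and hashCode that an equivalent Java class would usually spell out, which is one reason the same logic tends to be shorter in Scala.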
Project technology stack
The project is written in Scala using the Akka HTTP and Slick libraries. We use sbt to build the project and ScalaTest and Mockito to test it. Students and mentors also use GitLab, PostgreSQL, and MongoDB in their work.
Development methodology on the project
The CROMWELL project uses the Scrum methodology. The development process is divided into intervals called sprints, which usually last two weeks. Tasks to be completed during a sprint are planned and assigned in advance. At the end of the sprint, a retrospective is held, during which the team discusses what was done well and what could be improved or changed.
Students are in continuous contact with mentors. At short daily online meetings, they discuss the status of each team member's work, and at weekly meetings, the whole team analyzes problems and their solutions together.
Andrey:
I like the team. We have a homely atmosphere. Mentors try to educate and develop us. For example, every Tuesday we prepare a solution to a problem. We select it in advance on HackerRank and implement it in Scala. At the meeting, we analyze each other’s solutions and discuss how they could be done even better.
How the workflow is built
After learning the Scala syntax, students begin working on the internal CROMWELL pipeline project. Depending on the task, students work either independently or in groups.
The code is contributed to open-source projects on behalf of the company, so it needs to be of proper quality. First, the student’s solution is reviewed, discussed, and corrected by two other team members. Then, two mentors review the code. After verification, the solution is uploaded to the main working repository of the CROMWELL pipeline or sent as a request to open-source projects.
Andrey:
In the process of working on the project, we took a creative approach to tasks. For example, to write a “beautiful” solution, you could find an interesting open-source library and propose building it into the project. The mentors go through your solution, agree, and it gets implemented. It turns out that you come with your own solution to a problem, and it works. It’s always pleasant.
Project team
Now, three mentors are actively working on the project. They help students learn Scala and the project technologies, conduct code reviews, give lectures when necessary, and help with tasks in general.
Tech Leads coordinate most of the project activities. They ensure that all team members have tasks and the resources needed to solve them, and they try to help if there are any difficulties. They also communicate with the training center staff who are responsible for hiring, and they discuss the further allocation of people to future production projects with members of the Scala competency center.
Now, there are two technical leads on the project, and the tasks are divided between them: Mariia Nesvit coordinates open-source activities, and Ilia Onishchenko manages work on the CROMWELL pipeline.
Students usually study in groups of up to five people and work on different tasks, but in general, the work is distributed so that everyone works with all parts of the project.
The atmosphere in the training center laboratory
Andrey:
In our team, everyone can ask anyone for help at any time with any task. For example, when I find out that we have achieved some results with AWS that will be of interest to others, we arrange a joint call so that everyone is informed. It is not an assigned task, but everyone may be interested, and everyone joins in with pleasure. We have a very friendly team.
Yaroslav (laboratory student):
Our leads create a very pleasant, friendly atmosphere. Every day we have an online meeting called the “daily.” Sometimes I join such a meeting without having completed my task and feel nervous about it. In this case, our leads can motivate us or tell a joke that clears the air, which helps me a lot and gives me the strength to keep working.
Differences between a real project and a student project
On a real project, the customer usually sets specific tasks. In open-source projects, this happens rarely, so the teams working on CROMWELL have more freedom. For example, you can decide for yourself which library you would like to work with in order to understand it at a deeper level over time.
Andrey:
I have worked on two student projects. When I moved to production, it turned out that, in general, everything is the same: people are ready to communicate and help. But everything really is more serious, and the work goes much faster. On student projects there is time, so mentors can go through your code and evaluate how correctly and logically everything is written. On a real project, there are times when an hour passes, and you find out that your request is already in use.
The future of the project
Going forward, we are planning to develop our cooperation with open-source libraries. So far, we do not have much experience interacting with them, but working with open-source projects is useful to students as future specialists in the company.
Tips for novice specialists
Young specialists should communicate more with each other within the team and discuss issues with other students. Looking for an answer together is more useful than just getting a ready-made solution.
Do not be afraid to ask for clarification. However, as experience shows, you need to work on the wording. It is important to phrase a question so that it can be understood without the context of the current situation. Then another person will get it. Sometimes it is helpful to ask questions via chat, because you can read everything once again and see how clear the question will be to others.
Do not be shy about giving feedback to other people. Sometimes, a person may lack information and experience to find the right solution. In this case, your response can be very useful. It is more pleasant to work in a team where there is an active exchange of information and experience, and relations between people are open.