GitLab: A New Learning Experience

Hi there! Hope you are doing well!

So, for my current project I am learning GitLab in order to apply DevOps practices to the ETL pipelines in an existing system. Here I am sharing my experience with you as I learn GitLab.

Hope you find it useful. 

So, let's get started...

First and foremost...

What is GitLab?


It is a web-based DevOps platform that hosts your projects under Git version control on a GitLab server and provides built-in CI/CD on top of them.

If you are already familiar with a version control platform (e.g. GitHub or Bitbucket), then starting to work with GitLab will be comparatively easy. (yayyyy!!)

According to upGrad, "GitLab is more focused on offering a features-based system with a centralized, integrated platform for web developers, when compared with the popular counterpart GitHub. 

Since the team continuously brings new features, it becomes a good option if there is more continuous integration happening in your project."

Reference: https://www.upgrad.com/blog/github-vs-gitlab-difference-between-github-and-gitlab/

Starting with GitLab


It is fairly easy if you have some Unix shell scripting knowledge beforehand. 

The first step is to create a project in GitLab, which will serve as the repository for all your files. 

The main file to maintain is .gitlab-ci.yml; this file holds all the configuration details for your pipeline.

Note: the .yml extension refers to YAML, which stands for 'YAML Ain't Markup Language', a human-friendly data serialization standard that works across programming languages.
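
Just to get a feel for YAML before we touch the real configuration file, here is a tiny illustrative snippet of my own (the keys and values below are made up for this example; they are not GitLab keywords):

# key-value pairs, nesting via 2-space indentation, and a list
car:
  name: "my first car"
  parts:
    - chassis
    - engine
    - wheels

Indentation is what gives a YAML file its structure, which is why GitLab is picky about it.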


We define jobs that GitLab should execute in order to run the project. Here is a sample for reference:

Build

Note: we use 2 spaces for indentation, not the 'Tab' key.

build the car:   # the job name; it describes what this section of the file is for
  script:        # tells the GitLab Runner that the script part starts; it contains the shell commands required for the pipeline
    - mkdir build
    - cd build
    - touch car.txt
    - echo "chassis" > car.txt
    - echo "engine" > car.txt
    - echo "wheels" > car.txt


Now we can commit the changes we have made. As soon as we do, GitLab detects the configuration file, creates a pipeline from it, and shows its status (e.g. 'running') next to the commit, along with all the jobs we have defined. The pipeline itself is executed by a GitLab Runner.

Test

Once we build something, we want to test it as well, right? We can add another job to check the output of the build job we just defined. Here we go with that:


test the car:
  script:
    - test -f build/car.txt   # check that 'car.txt' exists in the build directory and is a regular file
    - cd build
    - grep "chassis" car.txt  # search for the string inside the file using the famous 'grep' command
    - grep "engine" car.txt
    - grep "wheels" car.txt


After that, we have 2 jobs ready in our pipeline, i.e. one where we build the car and a second where we test that we built it without any issues! (wooohooooo!!) However, there is one more step yet to be performed: defining the order of execution, because GitLab tries to execute jobs in parallel, which is not always what we want, as in our case here.

So we don't want build and test to run in parallel; we want them executed in sequence: first the build and then the test (the normal execution sequence, correct? :P)

Now, we add "stage" keyword with the stage name in our code and now we are ready for the run. :) :)


Our code now looks like this:

stages:
  - build
  - test

build the car:
  stage: build
  script:
    - mkdir build
    - cd build
    - touch car.txt
    - echo "chasis" > car.txt
    - echo "engine" > car.txt
    - echo "wheels" > car.txt

test the car:
  stage: test
  script:
    - test -f build/car.txt 
    - cd build 
    - grep "chasis" car.txt
    - grep "engine" car.txt
    - grep "wheels" car.txt


Now, if we run the pipeline, we should get the 'job succeeded' message! But do we really get that message??

The answer is No!! (why??)

The reason is that every job in GitLab CI/CD is independent of the others, so jobs don't share anything unless we explicitly tell them to!

Okay..so what do we do then?? 

We define artifacts, which are files or directories produced by a job that GitLab saves and makes available to jobs in later stages.
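
For reference, here is a minimal sketch of my own showing how an artifacts section can look (the job name and file names are placeholders, and 'expire_in' is an optional setting I am including only to show it exists):

some job:
  stage: build
  script:
    - mkdir output
    - echo "hello" > output/result.txt
  artifacts:
    paths:
      - output/         # everything under output/ is saved and handed to later-stage jobs
    expire_in: 1 week   # optional: how long GitLab keeps the saved files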

Let's do that now and see how our code looks and whether it runs successfully or not! 

There is, however, one more slight change we need to make for that (not again haha!!)

Okay, so if you noticed, we were not appending content to the file until now; instead, we were overwriting it each time (using the '>' operator). So when the test job runs, it can't find any text other than the most recently written line, which in our case is "wheels".
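
To make the difference concrete, here is a tiny illustrative job of my own (not part of our pipeline) showing what '>' versus '>>' leaves behind in a file:

show redirection:
  script:
    - echo "first"  >  demo.txt    # demo.txt now contains only "first"
    - echo "second" >  demo.txt    # '>' overwrites, so demo.txt now contains only "second"
    - echo "third"  >> demo.txt    # '>>' appends, so demo.txt now contains "second" and "third"
    - cat demo.txt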

So we now make the build job append to the file every time (using the '>>' operator). The code now looks like the below, and the job runs successfully (finally :P wooohoooo!!!) 


stages:
  - build
  - test

build the car:
  stage: build
  script:
    - mkdir build
    - cd build
    - touch car.txt
    - echo "chasis" >> car.txt
    - echo "engine" >> car.txt
    - echo "wheels" >> car.txt
  artifacts:    # files/directories saved from this job and passed to jobs in later stages
    paths:
      - build/

test the car:
  stage: test
  script:
    - test -f build/car.txt 
    - cd build 
    - grep "chasis" car.txt
    - grep "engine" car.txt
    - grep "wheels" car.txt




We will continue learning from here onward.

Thanks for stopping by!! We will meet in my next blog!! :)
