Startup for analyzing and managing Big Software

Codescoop is a Finnish startup with Spanish roots that provides an open-source tool for analyzing, improving, and managing Big Software.

What we did
Design
Audit
Back-end
Front-end
DevOps
Machine learning
Support

Background

Codescoop was founded in Helsinki in 2017 by Virginia del Olmo and Valtteri Halla as a distributed team building an enterprise support tool to improve the quality of software products from tech giants. In June 2018, with the support of the Finnish investment community, the startup received its first seed-round funding to deploy an affordable version of the service for private developers and IT engineers. As it scaled, Codescoop expanded its team with specialists from 8 countries, including engineers from FTL. In 2019 the project was temporarily suspended due to internal organizational barriers.
Location: Finland
Period: 2017-2019
Tech stack: Java, Go, Python, Node.js, MongoDB, Google Bigtable, HBase, Kubernetes, CircleCI

Customer Request

Codescoop: Hi. We launched a startup that identifies software flaws by analyzing private and open-source components. Now we want to scale it up and make it available to ordinary developers. We need one or several outsourced specialists to implement all the features we have planned. Would you be interested in such cooperation? If so, how can we work as a team?

FTL: Hi, of course, we can provide the required engineer. But first we need to understand the specifics of the service. We need all the technical information on the project, its requirements, and its tasks. How does the development process work, and what tools do you use to exchange information?

Challenges

  • Implement the fastest possible data collection for code components

    The task was to collect data on hundreds of thousands of open-source components from sources such as GitHub, GitLab, Hacker News, and npm. Codescoop's existing parsers did not collect metadata fast enough for the service to perform at its best, which prompted the search for better solutions.

  • No automated search for vulnerabilities and license compliance conflicts

    Codescoop customers typically updated code components to optimize their software products by looking for fresh releases in open sources. As a result, a key problem for corporate clients was conflicts between code components at the license compliance level, which slowed down software releases.

  • The need to store more than 100 million time series with fast search in the database

    Practice showed that general-purpose databases such as MongoDB could not cope with storing and quickly searching 100 million time series. It was therefore necessary to find and implement a more specialized solution for time-series storage.

  • Optimization of work with code components when finalizing Big Software

    While working with the variety of open-source components used by Codescoop clients, finding the relevant code points by specific criteria took a long time. An automation tool was needed to run flexible searches for the necessary components across all open sources.

  • No solution to deploy a test environment

    With the software product under constant improvement and testing, the test environment had to be deployed regularly. Keeping it running permanently would require always-on cloud infrastructure and significantly increase costs. The Codescoop team therefore needed a solution that would launch the systems only at the time of testing.

  • The complexities of simultaneously deploying cloud and on-premises infrastructure

Solutions

  • Integration of ORT into the Software Intelligence system

    The tool automated the search for conflicts between the licenses of open-source components and corporate requirements. This accelerated the release of software products, minimized the risk of license compliance violations, and saved corporate customers a significant part of their budget.

  • Automating infrastructure deployments with Terraform

    All the infrastructure needed to bring up the service was described in Terraform, which made installing on-premises solutions for large companies easy. As a result, the project team sped up work on corporate projects several-fold, thanks to individual and flexible infrastructure management for urgent tasks.

  • Implementation of Google Bigtable (GBT) for storing time series of parsing results over 3 years of the product life cycle

    To store the development history of the components, we used daily time series covering the last 3 years. An array of 100 thousand components produced about 100 million time series, which were later used by various ML algorithms. To speed up work with large amounts of data and improve performance under high load, we chose Google Bigtable and later HBase.

  • Search and filtering system for Big Software code components

    FTL implemented Elasticsearch, which enables scalable, multi-threaded searches across various metrics and component characteristics. As a result, developers improving a software product could find the relevant elements for further work in a couple of clicks.

  • Optimizing the deployment of the test environment for the tasks of the project team

    To minimize costs, specialists developed an algorithm for automating the deployment of about 20 microservices and the necessary infrastructure. Thus, the test environment was active only at the time of direct work with the components and architecture, which significantly reduced the costs for the cloud infrastructure.

  • Crawler development in Go

    Because the project's technical and software infrastructure had to sustain a throughput of 1 million components, the specialists developed crawlers in Go to withstand the required load. This significantly simplified and sped up the analysis of software product vulnerabilities.
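
The sizing behind the time-series storage described above can be checked with quick arithmetic. The sketch below is illustrative only and is not taken from the Codescoop codebase. It reads the "about 100 million" figure as one daily data point per component (100 thousand components over 3 years). The row-key layout, the `#` separator, and all function names are assumptions, chosen to show why a store that keeps rows sorted by key, such as Bigtable or HBase, suits this access pattern.

```python
from datetime import date, timedelta

COMPONENTS = 100_000   # figure from the case study
YEARS = 3              # figure from the case study
DAYS_PER_YEAR = 365


def estimated_points(components: int = COMPONENTS, years: int = YEARS) -> int:
    """One daily data point per component over the retention window."""
    return components * years * DAYS_PER_YEAR


def row_key(component_id: str, day: date) -> str:
    """Hypothetical row key: component id, separator, ISO date.

    ISO dates sort lexicographically in chronological order, so the
    history of one component is stored contiguously.
    """
    return f"{component_id}#{day.isoformat()}"


def scan_range(component_id: str, start: date, end: date) -> tuple[str, str]:
    """Inclusive start key and exclusive end key for a date-range scan."""
    return (
        row_key(component_id, start),
        row_key(component_id, end + timedelta(days=1)),
    )
```

With these assumptions, `estimated_points()` gives 109,500,000, which is consistent with the "about 100 million" quoted above, and a date-range query for one component becomes a single contiguous key-range scan instead of a lookup per day.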

8 countries covered by the Codescoop project

30 senior developers

3 largest business hubs supported in Finland

Main features

Development of a convenient UI/UX system

The project team created a unique visualization. Thanks to it, even in the early stages of the software product life cycle, you can get a comprehensive analysis of the Big Software technical stack in diagrams and metrics. This saves a significant portion of the budget and time for managing critical operations, fixing failures, and implementing changes in the later stages of the product life cycle.

Definition of key metrics and clustering of components into groups

Software Intelligence not only includes cross-system data collection algorithms but also analyzes vulnerabilities and proposes solutions to improve the software product. The Codescoop team, together with FTL specialists, developed a unique tool based on Python, machine learning, and artificial intelligence methods that predicts how software components will develop, both over time and technically. With its help, Big Software developers can make timely tactical and strategic decisions to improve and change the software product as requirements evolve.
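
To make the idea of clustering components by key metrics concrete, here is a minimal sketch. The case study does not name the algorithm or the metrics that were used, so everything here is an assumption: plain k-means stands in for whatever method the team chose, and the component names and the two normalized metrics (release frequency, open-issue ratio) are hypothetical.

```python
from math import dist


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids (deterministic)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Hypothetical, already normalized metrics per component:
# (release frequency, open-issue ratio)
components = {
    "lib-a": (0.9, 0.1),
    "lib-b": (0.8, 0.2),
    "lib-c": (0.1, 0.9),
    "lib-d": (0.2, 0.8),
}
labels, _ = kmeans(list(components.values()), k=2)
groups = dict(zip(components, labels))
```

On this toy data the actively released components (`lib-a`, `lib-b`) end up in one group and the issue-heavy ones (`lib-c`, `lib-d`) in the other. A production pipeline would more likely use an established ML library and many more metrics.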

Technologies

Back-End

Java
Go
Python
Node.js

DevOps & Infrastructure

MongoDB
Google Bigtable
HBase
Kubernetes

CI

CircleCI
