Flow’s Git metadata collector (GMC) is client-side software that sends commit-related metadata from your git servers to Pluralsight Flow. Because only metadata is sent, your source code never leaves your environment, and you can still take advantage of Flow’s insights and metrics.
The GMC is an optional add-on to your Flow plan. Please reach out to your Flow contact for information on how to get started with a GMC installation.
Note: If you plan to use the GMC inside a single-tenant cloud environment, there are additional prerequisites you must meet before installing the GMC.
What does the GMC installer do?
The GMC installer is an Ansible installer which:
- Installs dependencies
  - ContainerD
  - Nerdctl
  - Pip
  - Python3
  Note: If Python is already installed on the host, ensure it's version 3.7 or above. If not, update the version to 3.7 or above before running the installer. The Flow GMC is not compatible with earlier Python versions. If Python is not installed on the host, the installer will install a compatible version for you.
- Creates a user for runtime, unless a user is specified at the time of installation
- Creates a repo cache folder where cloned copies of repos are stored.
  - This location is configurable. Default: /var/lib/flow/agent/repoCache
- Creates a configuration directory where the GMC configuration file is stored
  - This location is configurable. Default: /etc/flow/agent/config
- Imports the GMC container image
- Creates a cron job to run the image at regular intervals.
  - This interval is configurable. The default interval is 3 hours and can be increased as needed based on how much data you have.
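The Python note in the dependency list above can be checked before running the installer. The following is a minimal sketch, not part of the GMC installer: the function name python_version_ok is hypothetical, and the usage comment assumes python3 is on the PATH.

```shell
# Returns 0 when a "major.minor[.patch]" version string is 3.7 or above.
# python_version_ok is a hypothetical helper, not something the GMC ships.
python_version_ok() {
  local major rest minor
  major="${1%%.*}"          # text before the first dot
  case "$1" in
    *.*) rest="${1#*.}"; minor="${rest%%.*}" ;;
    *)   minor=0 ;;         # no minor component given
  esac
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 7 ]; }
}

# Usage (assumes python3 is installed and on PATH):
#   ver="$(python3 -c 'import platform; print(platform.python_version())')"
#   python_version_ok "$ver" && echo "compatible" || echo "update Python first"
```

If python3 is not installed at all, no check is needed, since the installer installs a compatible version.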
System requirements
To successfully install and run the GMC with the GMC installer, your system must meet the following requirements.
Network requirements
The host must have:
- Access to the git repositories you want to connect to.
- Egress configured to access the Flow API gateway, as listed in the downloaded config file
  - The link for multi-tenant cloud is https://flow-git-agent-api.pluralsight.com. If you’re on a single-tenant instance, the link will be custom to your instance.
- Internet access at the time of installation to successfully locate updates.
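The egress requirement above can be spot-checked from the host with curl. This is a rough sketch under assumptions: check_egress is a hypothetical helper, and it treats any HTTP response (including an error status from the gateway root) as proof that the network path is open. It does not validate credentials or the GMC configuration.

```shell
# Returns 0 when the host receives any HTTP response from the given URL.
# check_egress is a hypothetical helper used only for illustration.
check_egress() {
  local url="$1" code
  # -w '%{http_code}' prints the response status; curl reports 000 when
  # no HTTP response was received (for example, a blocked connection).
  code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url")" || return 1
  [ "$code" != "000" ]
}

# Usage (multi-tenant URL from the list above; single-tenant URLs differ):
#   check_egress "https://flow-git-agent-api.pluralsight.com" \
#     && echo "gateway reachable" || echo "egress blocked or host unreachable"
```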
Compute resource recommendations
- CPU: 4 processor cores
- Memory: 2 GB
These recommendations will fit most customer needs, but requirements may vary depending on the number of workers and commit threads in your configuration.
Storage requirements
- The total size of all your git repositories, multiplied by at least 1.3 to account for growth.
- 2 GB for log files
- File system must support fsync. Please avoid NFS mounts.
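As a worked example of the storage arithmetic above, the sketch below estimates the minimum disk space from a total repository size. The helper name required_storage_mib is hypothetical. It assumes sizes are given in whole MiB and that 2 GB of log space is counted as 2048 MiB. The 1.3 factor is the stated minimum, so treat the result as a lower bound.

```shell
# Prints the estimated minimum storage in MiB:
# (total repo size x 1.3, rounded up) + 2048 MiB for log files.
# required_storage_mib is a hypothetical helper for illustration.
required_storage_mib() {
  local repos_mib="$1"
  echo $(( ($repos_mib * 13 + 9) / 10 + 2048 ))
}

# Usage (the path is a placeholder for wherever your repos can be measured):
#   total="$(du -sm /path/to/repos | cut -f1)"
#   required_storage_mib "$total"
```

For instance, 10000 MiB of repositories gives 13000 MiB after the growth factor, plus 2048 MiB for logs, for 15048 MiB in total.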
Operating system
The Flow GMC can be run on compatible Linux operating systems, including:
- Debian 11, 12
- Ubuntu 20.04, 22.04, 22.10, 23.04
- CentOS 8 (stream), 9 (stream)
- RHEL 8.6, 8.7, 8.8, 8.9, 8.10, 9.0, 9.1, 9.2, 9.3
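To compare a host against the releases listed above, one option is to read the ID and VERSION_ID fields from /etc/os-release. The sketch below is illustrative only: is_listed_os is a hypothetical helper, and it assumes the usual os-release identifiers (debian, ubuntu, centos, rhel). Because the list above is introduced with "including", a non-match does not necessarily mean the host is incompatible.

```shell
# Returns 0 when the ID/VERSION_ID pair matches a release listed above.
# is_listed_os is a hypothetical helper; the identifiers are assumed to
# follow the usual /etc/os-release values.
is_listed_os() {
  case "$1:$2" in
    debian:11|debian:12) return 0 ;;
    ubuntu:20.04|ubuntu:22.04|ubuntu:22.10|ubuntu:23.04) return 0 ;;
    centos:8|centos:9) return 0 ;;
    rhel:8.6|rhel:8.7|rhel:8.8|rhel:8.9|rhel:8.10) return 0 ;;
    rhel:9.0|rhel:9.1|rhel:9.2|rhel:9.3) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   . /etc/os-release
#   is_listed_os "$ID" "$VERSION_ID" && echo "listed" || echo "not in the list"
```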
GMC vendor and authentication requirements
In Flow, you can set up integrations using the GMC for the following vendors. You must use one of the following authentication methods in the GMC to ingest git data:
- GitHub and GitHub Enterprise Server
  - Access token
- GitLab and GitLab self-hosted
  - Access token
- BitBucket Server
  - Access token
  - Username/password
- BitBucket
  - Username/password
- Azure DevOps Services
  - Access token
- Azure DevOps Server (TFS)
  - Access token
Note: Only git data is collected through the GMC. Project lists, PR data, and ticket data are not collected through the GMC, and are instead collected through the authentication method provided in Flow. If you don’t intend to ingest PR or ticket data, you can use the Project discovery push API to send project lists to Flow.
GMC versioning and supported versions
Flow supports N-2 minor versions of the GMC. Unless otherwise specified, we recommend updating to the latest version, which will contain the most up-to-date patches.
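One possible reading of the N-2 policy is that an installed version is supported when it shares the latest version's major number and is at most two minor versions behind. The sketch below encodes that reading only. The helper within_n_minus_2 is hypothetical, and the "major.minor.patch" format is an assumption, since the article does not describe the GMC version format. Confirm the actual policy with your Flow contact.

```shell
# Returns 0 when $1 (installed) is within two minor versions of $2 (latest).
# Assumes "major.minor.patch" strings and the same major version; both the
# format and this interpretation of "N-2" are assumptions.
within_n_minus_2() {
  local cur_major cur_minor new_major new_minor rest
  cur_major="${1%%.*}"; rest="${1#*.}"; cur_minor="${rest%%.*}"
  new_major="${2%%.*}"; rest="${2#*.}"; new_minor="${rest%%.*}"
  [ "$cur_major" -eq "$new_major" ] && [ "$cur_minor" -ge $(( new_minor - 2 )) ]
}

# Usage (version numbers are made up for illustration):
#   within_n_minus_2 "1.5.0" "1.7.2" && echo "supported" || echo "update recommended"
```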
GMC installation with the GMC installer
Before running the installer, create all your git integrations in Flow. This ensures your GMC config file is populated with the correct information for each integration. If you add integrations after you download the config file, you must redownload the file and update it in your system.
Note: When creating integrations, you can use any of the available integration methods, including OAuth. The authentication requirements for the GMC itself are separate from the authentication options in Flow. The Flow authentication methods are used to collect project lists, PR data, and ticket data as needed. If using a different authentication method or token for the GMC and Flow, please ensure that both have access and permissions to the same repositories to avoid failures when retrieving data.
- After you’ve created your integrations, go to the Integrations page and click Download metadata collector assets. In the modal, choose the installer option if prompted.
- Next, click Download GMC config file to access the file containing all integration and authentication information to add to the GMC.
- Click Generate installer link, then click Copy link. For security reasons, this link expires after one hour, so make sure you’re ready to download the installer inside your host environment. Otherwise, return to complete this step when you’re ready.
- Create an API key in Flow, using a Flow user service account if possible.
- Download the GMC installer inside your host environment using the installer link. To do this, run:
  curl -o flow-ingestion-agent.tar.gz "{installer link}"
- Extract the tar file by running:
  tar -xzvf flow-ingestion-agent.tar.gz
- Run the GMC installer script with sudo from {executable path}/flow-ingestion-agent-install.sh. For example:
  sudo ./flow-ingestion-agent-install.sh
  The install script must be run with sudo.
- Put the GMC config file in /etc/flow/agent/config.
- Update your GMC config file to include your authentication values.
- Add your API key to the GMC config file so it can access the Flow API gateway.
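The command-line steps above can be collected into one function, sketched below. The name install_gmc and its arguments are hypothetical. It assumes the archive extracts flow-ingestion-agent-install.sh into the current directory unless another executable path is passed, and that the config file you downloaded from Flow is available at a local path of your choosing. Editing the config file to add authentication values and the API key remains a manual step.

```shell
# Sketch of the download, extract, install, and config-copy steps.
# install_gmc and its arguments are hypothetical names for illustration.
#   $1  installer link copied from Flow (expires after one hour)
#   $2  local path to the GMC config file downloaded from Flow
#   $3  directory containing the install script (assumed default: .)
install_gmc() {
  local installer_link="$1"
  local config_file="$2"
  local executable_path="${3:-.}"

  # Download and extract the installer archive.
  curl -o flow-ingestion-agent.tar.gz "$installer_link" || return 1
  tar -xzvf flow-ingestion-agent.tar.gz || return 1

  # The install script must be run with sudo.
  sudo "$executable_path/flow-ingestion-agent-install.sh" || return 1

  # Place the config file in the default configuration directory.
  sudo cp "$config_file" /etc/flow/agent/config/
}

# Usage (placeholders, not real values):
#   install_gmc "{installer link}" ./path/to/gmc-config-file
```

Stopping at the first failed step avoids running the installer against a partial download, for example when the installer link has already expired.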
At this point, your GMC installation should be complete and your data should begin ingesting and processing. If you add any integrations in the future or want to modify your credentials, download the GMC config file from Flow again and replace your current config file with the new one. Alternatively, edit the config file you’re already using.
Note: All logs are stored in var/log/flow. Run crontab -l to see the GMC cron job.