The Importance of Version Control, Git and GitHub
by Embedded Office
Version control for embedded software: what you need to know
Version control has long been part of computer software development and is now becoming increasingly important in embedded code too. The quest to automate it has seen productivity tools such as Git and GitHub expand in usage and popularity. Here, we look at some of the reasons for this growth, the background to it and the factors to consider when deploying Git in embedded software environments.
Any group of files or directories is subject to changes and updates, including software code, technical documentation or release notes. Version control allows the tracking and comparison of changes, as well as the recovery of previous states if necessary. Its detractors may question the payback period and the staff training time, and sole developers sometimes argue that, as the only authority on the code, they ‘have it all in their head’. Such arguments might well make a software development or IT security professional shudder.
Nowadays, top organisations insist on best practice as a matter of policy; version control is mandatory for mission-critical software such as avionics, medical or industrial control devices. Additionally, some companies have faced legal liabilities for product defects caused by a lack of version management. The benefits of robust systems are, therefore, clear.
The simplest form of version management involves setting up new directories for each update. However, even with automatic date and time stamping of directory names, these rudimentary systems are notoriously error prone. It can be so easy to lose track of which directory is currently in use and, consequently, overwrite files with incorrect versions. Are there many computer users who have not, at some time, experienced exasperation doing just this?
As a result, a more organised and efficient solution became necessary.
Developers collaborate with other developers. For this reason, teams initially adopted centralised systems, in which a single server holds the version information and files are checked out from a central repository. Though relatively easy to administer, such centralised systems are a single point of failure and, therefore, a business risk.
Due to the above limitation, distributed version control systems became more popular, with a full mirror of the repository on each client machine. If a server failed, any cloned client repository could be copied back to the server on its reinstatement. In effect, the remote copies were backups. When Linus Torvalds and the Linux kernel community developed Git as a free, open-source tool in 2005, the design followed these principles and aimed to deliver speed, simplicity, scalability and support for non-linear development with many parallel branches.
Git stores project data as a series of snapshots rather than as a list of file differences. Most operations run locally against the full copy of the repository, so no permanent link to a central server is needed, for example if the user is temporarily offline while travelling. Git detects changes and checks integrity by identifying every stored object with a checksum, by default a forty-character hexadecimal SHA-1 hash. Additionally, it supports procedural compliance by recording a single, tamper-evident history of project files.
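To make the forty-character checksum concrete, here is a minimal Python sketch of how Git derives the identifier for a single file's content (a "blob"). It assumes Git's default SHA-1 object format; the function name `git_blob_id` and the sample contents are purely illustrative.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git would assign to a file with this content."""
    # Git hashes a short header ("blob <size in bytes>" plus a NUL byte)
    # followed by the raw file content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Illustrative contents: a one-character change yields a different ID,
# which is how altered or corrupted content is detected.
original = b"hello world\n"
modified = b"hello world!\n"

print(git_blob_id(original))       # forty hexadecimal characters
print(len(git_blob_id(original)))  # 40
print(git_blob_id(original) != git_blob_id(modified))
```

On a machine with Git installed, the first value should match what `git hash-object` reports for a file with the same content.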
Users control the Git repository, and access to it, through command-line instructions.
GitHub is a management tool that makes Git easier to use. Its visual interface connects version control projects to the web so other users can read projects, spot errors and suggest fixes. The ability to study and build on other people’s contributions makes GitHub ideal for knowledge workers; many users store programs and project code, but nearly any kind of file is possible.
Notably, GitHub is not so much a development tool as a technical social network where an individual sets up a profile and then connects with or follows other accounts. Usefully, under the terms of service, GitHub supports users’ retention of ownership of their projects; there is no automatic concession of intellectual property rights.
These advantages have led to Git becoming a fundamental part of DevOps. According to the supplier website, GitHub is now in use in over 100,000 organisations worldwide including Airbnb, Bloomberg, IBM, PayPal, SAP and Spotify. Vendor information mentions a low price tag, ease of set up and flexibility in customising workflows.
In addition, developers can use the interface to discuss ideas and hold conversations next to the code; there is no need to switch tools to review, share input or widen conversations within a team. GitHub may also be useful for project staff who come from outside software development teams and who do not find the Git command line easy to use.
Other advantages include the ability to search across projects to find and reuse code, to develop security policies according to a team’s needs and to see who authored code or pushed a commit. Workflow rules can be enforced automatically, and protected branches guard against unintentional changes.
The solution is scalable from two seats through to a large organisation, where team managers can monitor how developers are working together through audit data and dashboards.
Increasingly prominent in the embedded world, Git and GitHub enable the efficient management of code versions. However, some extra security considerations are necessary to help prevent DDoS (distributed denial of service) or other attacks.
Git was not designed to be a security tool; it does little more than verify the integrity of its history. It has no built-in file permission or authentication subsystem. This is where Git management through GitHub comes in, with additional security and control of file or directory access. Some options available via github.com add security to repositories or branches, while others provide customisation at a more granular level. GitHub Enterprise offers enterprise-grade security and there are choices of private cloud or local company server hosting.
Binary files tend to be large and are not the easiest type to track. On large projects in the embedded industry, therefore, Git's architecture can become a limiting factor. In a problem sometimes called Git sprawl, cloning or downloading repositories takes longer and the system responds slowly on bigger projects, with waits growing from several seconds to minutes in extreme cases.
So, what are the options? As Git repositories cannot easily be split up, Git management tools help to make them more usable. Perforce Helix, for instance, permits their division into smaller chunks in hybrid environments.
Additionally, on bigger projects, an extension known as Git LFS (Large File Storage) speeds up response times by replacing these sizeable files with text pointers in the relevant repositories. A different server then stores the voluminous data, while products such as Bitbucket enable its efficient administration – even though there is no longer one ‘master’ point of reference.
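As a rough illustration of those text pointers, the Python sketch below builds the small file that Git LFS commits in place of a large binary: a version line, the SHA-256 digest of the real content and its size in bytes. The function name `lfs_pointer` and the stand-in firmware image are hypothetical; in practice the Git LFS client creates these pointers itself.

```python
import hashlib

# Version line used by Git LFS pointer files.
LFS_SPEC_URL = "https://git-lfs.github.com/spec/v1"


def lfs_pointer(content: bytes) -> str:
    """Build the text of a Git LFS pointer for the given file content."""
    oid = hashlib.sha256(content).hexdigest()
    return (
        f"version {LFS_SPEC_URL}\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


# Hypothetical 1 MiB firmware image (all zero bytes) standing in for a real binary.
firmware = bytes(1024 * 1024)
pointer = lfs_pointer(firmware)

print(pointer)
# The pointer stays a few lines long however large the binary is.
print(len(pointer.encode("utf-8")) < len(firmware))
```

The repository then only has to track this short text file, while the binary itself lives on the separate LFS server.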
Ever more important in modern embedded software development, version control is essential in safety-critical applications and industries. Version management tools such as Git, GitHub and their extensions have changed considerably over recent years, with increasingly powerful features to manage collaboration, automation and security. Finally, there are now more options to adapt and fine-tune the solutions to get the best from them in embedded code development environments.