From c1657fe3c3e299b8b490b550a123baae841dfb4c Mon Sep 17 00:00:00 2001 From: "Uwe Jandt (DESY)" <uwe.jandt@desy.de> Date: Tue, 8 Dec 2020 13:39:06 +0100 Subject: [PATCH] two SCC survey posts: corrected dates, coding background --- .../2020/10/2020-10-15-survey-technology.md | 205 ------------------ .../2020-11-07-survey-results-language-vcs.md | 149 ------------- 2 files changed, 354 deletions(-) delete mode 100644 _posts/2020/10/2020-10-15-survey-technology.md delete mode 100644 _posts/2020/11/2020-11-07-survey-results-language-vcs.md diff --git a/_posts/2020/10/2020-10-15-survey-technology.md b/_posts/2020/10/2020-10-15-survey-technology.md deleted file mode 100644 index 8cdda38c3..000000000 --- a/_posts/2020/10/2020-10-15-survey-technology.md +++ /dev/null @@ -1,205 +0,0 @@ ---- -title: "HIFIS Survey 2020: A Technology Perspective" -date: 2020-11-27 -authors: - - huste - - hueser -layout: blogpost -title_image: default -categories: - - report -tags: - - survey - - technology -excerpt: > - The HIFIS Software survey gathered information from Helmholtz - research groups about their development practice. This post shows some - insights from a technology perspective and tries to make some conclusions - for the future direction of HIFIS Software technology services. ---- - -Beginning of 2020 the HIFIS Software team initiated a software survey -targeting employees of the whole Helmholtz Association in which 467 participants -could be considered for the analysis. -The figure below depicts how strongly the different Helmholtz research fields -are represented in this survey. - -{:.treat-as-figure} - - -With the results of the survey we want to understand, how we as HIFIS Software -Services can best support your every day life as a research software developer. -In this blog post we will examine the results from a technology perspective -and will on the one hand give an overview of the status quo of the software -engineering process of the participants, and on the other hand try to identify -specific measures. - -## Version Control - -One of the basic requirements for developing sustainable and high-quality -research software is the usage of a version control system (VCS). -On the market there exist multiple competitors, distributed version control -systems like Git or Mercurial and centralized version control systems like -SVN. -In accordance with the trends shown in analysis done by Stackoverflow, we -expected Git to be the most popular tool within Helmholtz. - -{:.treat-as-figure} - -Trend of Stackoverflow questions per month. Created via [Stackoverflow Trends](https://insights.stackoverflow.com/trends) -on 2020-10-15. - -The participants of the survey have answered to the multiple-choice question -about which VCSs they use as shown in the figure below. - -{:.treat-as-figure} - - -A similar diagram as above has already been evaluated in a related -[blog post on results from the survey analysis]({% post_url 2020/11/2020-11-07-survey-results-language-vcs %}). -Here, based on these descriptions we only would like to draw conclusions -from a technological point of view. -Only roughly 10% of the participants claim that they do not use VCSs -while developing their research software. -These results indicate that the awareness is high among the participants -that the usage of version control systems is an important aspect in -sustainable software development. - -In order to unravel that a bit more, we identified a trend in the figure below -that the use of VCSs increase the wider research software developers share -their source code in terms of categories like within their research group, -research organization, research field or even general public. -Hence, there might be a relationship between the broadness of code -share and usage of VCSs. -If this trend holds true then it illustrates that version control -systems are indeed mandatory tools to collaborate with other -developers. - -{:.treat-as-figure} - - -The responses to the survey are then grouped into the six Helmholtz research -fields: - -* Aeronautics, Space and Transport -* Energy -* Earth and Environment -* Health -* Matter -* Key Technologies - -{:.treat-as-figure} - - -In the research field _Aeronautics, Space and Transport_ SVN seems to be -more widely spread compared to other research fields but also the portion -of developers who do not use version control is lowest among the -participants of this research field. -On the one hand, given the collected data about the amount of VCSs questions -asked on Stackoverflow over time introduced earlier this most probably gives an -indication that there is a significant amount of comparably older repositories -that use SVN and that this research field might have a longer tradition of -using VCSs. -On the other hand, this shows that the use of VCSs in this research -field today is more prevalent compared to other Helmholtz research fields. - -From the data it is also possible to compare the usage of version control -systems with the team size participants usually develop software in. -The result is shown in the figure below: - -{:.treat-as-figure} - - -It is clearly visible that the amount of participants who claim to not use any -kind of version control decreases with increasing team size. -This insight is actually very valuable. -This illustration suggests a relationship between team size and the use of VCSs. -One reason for increasing use of VCSs with growing team size might be that VCSs -make collaboration more comfortable and that researchers are aware of this fact. -Whether the use of VCSs has actually already become a de-facto standard in -research software will be further investigated (e.g. in our next survey). - -On the other hand from the participants who claim to develop software mostly -on their own 20% specify to not use version control at all. -This is something we as HIFIS Software Services would like to see change in -the future. -For us, it is important to make people aware that using version control is a -mandatory requirement for software development projects of any scale. -This requires us to make the entry hurdle to using version control systems as -low as possible. -This means that every software developer in Helmholtz must have -access to a suitable and easy-to-use infrastructure to enable this basic -requirement. -Therefore, HIFIS Software Services will offer a GitLab instance that is -usable by every employee of the Helmholtz Association free of charge. - -## Software Development Platforms - -Using version control systems can be considered the entry-point to a world of -platforms that build even more around this basic requirement. -Even if you can typically use a version control system completely local -as well, it really starts paying off when combining version control with online -platforms like e.g. GitLab, GitHub or Bitbucket. -On the one hand this opens up your project for collaboration but also gives -you access to a whole ecosystem of other extremely useful tools like issue -tracking, merge requests, CI/CD or code reviews. -This is why we were also eager to know which software development platforms -the participants use in their every-day life. - -{:.treat-as-figure} - - -The results show that among the participants the most widely used platforms -are GitHub.com and self-hosted GitLab instances followed by GitLab.com. -Thus, about 54% of the participants claim to use GitHub.com, 49% use self-hosted -GitLab instances and about 25% of the participants specify to use GitLab.com. -About 13% claim to not use any of the platforms. -This value is in a similar range to the participants who specified to not use -version control systems. - -## Continuous Integration - -Continuous Integration (CI) is referred to as the practice of merging code -changes into a shared mainline several times a day. -A typical workflow would incorporate the automatic building of a software, -the automatic execution of unit tests and finally, the automatic deployment of -artifacts, e.g the documentation or compiled binaries. -The last step is also referred to as Continuous Deployment (CD). -On the market, there exist multiple tools that support this kind of software -development process. -Some of the tools available at the time of this survey were GitLab CI, Jenkins, -Travis or CircleCI. - -The results of the survey show a pretty diverse situation for the usage of CI -services by the participants. - -{:.treat-as-figure} - - -On the one hand, a portion of 53% of the participants claim to not use CI -services at all. -Among the participants who declared to use CI services, the most commonly used -technologies were GitLab CI (29%), Jenkins (16%) and Travis CI (13%). -Due to the fact that many Helmholtz centers host their own GitLab instances -which also allows to use GitLab CI, we expected GitLab CI to be the most -popular tool among the participants of the survey. -Jenkins is also a tool that can be self-hosted and thus, is also popular and -available in different centers. -Due to the popularity of GitHub, especially for Open Source projects, -it is not surprising that also Travis CI is widely chosen according -to the survey responses. -At the time of creating the survey, GitHub Actions was not yet widely available -on the market. -This explains, why this service does not show up in the list of chosen tools. - -We as HIFIS Software Services would like to see a rise in the overall usage -of CI/CD in the daily software development process. -It offers the chance to automate repeating tasks and introduces automated -quality checks for code changes before they get merged into the mainline. -Therefore, we want to ensure that every Helmholtz researcher regardless of -their affiliation has seamless access to general purpose resources for CI/CD. -This is why the provided GitLab instance will be equipped with scalable -resources for CI/CD. -With this offer, in combination with proper education, training and -consultation we hope to see a rise of the general usage of automation -technologies in research software engineering. diff --git a/_posts/2020/11/2020-11-07-survey-results-language-vcs.md b/_posts/2020/11/2020-11-07-survey-results-language-vcs.md deleted file mode 100644 index e93c5527e..000000000 --- a/_posts/2020/11/2020-11-07-survey-results-language-vcs.md +++ /dev/null @@ -1,149 +0,0 @@ ---- -layout: blogpost -title: "HIFIS Survey 2020: Programming, CI and VCS" -date: 2020-11-27 -authors: - - erxleben -title_image: default -categories: - - report ---- - -## Introduction -In the beginning of 2020 the HIFIS team conducted a survey among Helmholtz -scientists with the goals of learning more about the current practices -concerning research software development and identifying future challenges. - -This blog post will present a glimpse into the survey's results and our take -on the gathered data. -Specifically, we will take a look at the distribution of programming languages -across the different research fields as well as the utilization of -_Version Control Systems_ (VCS) in the same context. -Last, a short insight into the prevalence of various -_Continuous Integration_ (CI) systems will be given to round out this blog -post. - -## Programming Languages - -We asked the survey participants which programming languages they regularly -used for writing research software. -The following heatmap displays the relative usage of the most predominant programming languages for each research field - -{:.treat-as-figure} - - -All presented numbers are the relative usage of a given language in a given -field. -They might not always add up to exactly 1.00 per field or per language due to -multiple factors: - -* Some participants did not answer both questions. - These answers are not represented in the plot. -* Languages that had not at least a _5%_ share in at least one field were - omitted to focus on the most prominent ones and make the graphic easier to - read. - -### What can We Learn? - -The first thing that catches the eye is that Python seems to be very dominant -in every research field. -We have to take this appearance with a slight grain of salt since the survey did -not distinguish between the outdated, but generally popular, Python 2 and -the current Python 3. -The popularity of the language amongst researchers is not very surprising: -They are well suited for quickly creating small scale scripts, combined with -an extensive choice of libraries for many use cases. - -Consequently, our education and training efforts will continue to provide -offers regarding programming in Python and create appropriate courses and -materials to further the knowledge and best practices in this language amongst -scientists and research software developers. - -Regarding consultations we expect the team to receive requests regarding the -porting of older Python 2 applications to Python 3, as well as support -requests for dealing with the variance of virtual environments and package -management for this language. - -A second language often selected was C++ which often is a popular choice in -high performance computing and larger applications. - -This indicates a potential demand for supporting this language in the future as -well, especially in the context of training as well as consulting. - -Notable further mentions would be the the strong presence of the statistics -language R in the _Health_ and _Earth and Environment_ research fields, -which implies the opportunity for education and consulting being tailored and -advertised more towards these areas. - -## Version Control systems - -Similarly to the question above, a second question was analyzed, concerning the -usage of _Version Control Systems_ (VCS) amongst the participants related to -specific fields of research. - -{:.treat-as-figure} - - -The strong prevalence of Git is apparent at first glance. -As a runner-up there are still some projects out there based on SVN for -version control, which - together with a few mentions of CVS - might be an -indicator for older, longer living projects. -The amount of projects not using any version control at all is comparatively -low, which points toward the usage of VCS being an established step in setting -up projects across all research fields. - -From an education perspective it appears to be the right way to continue to -focus on basic and advanced Git-courses and promote version control as one of -the standard practices in every scientists toolbox. -It can be expected that the consulting team might face requests for help with -migrating projects from SVN or CVS to Git in the future. - -## Continuous Integration - -As a third question we wanted to know which _Continuous Integration_ (CI) -services the participants use to automate tasks surrounding their projects. -This, again, was a multiple choice question and the following plot shows the -relative distribution of the given answers: - -{:.treat-as-figure} - - -One very prominent outcome is that over half of the participants did claim to -not use any CI at all. -Several possible reasons for this finding come to mind: -* The question was not clear enough and participants who actually use CI were - not aware of that fact. -* Participants are not aware that CI exists. -* Participants do not see any potential benefit of CI for their projects. -* Participants do not know how to set up and use CI. - -Given that practically any project can benefit from employing -_Continuous Integration_ services by automating at least the mundane management -tasks like license checking, documentation generation, style checks, etc. all -four given reasons can be assumed to be a lack in awareness and education. - -Further, the plot reveals that the currently used CI solutions are (in -descending order of percentage) _GitLab CI_ which holds over a quarter of all -shares, _Jenkins_ and _Travis CI_ with all other services being barely -represented. - -Building on the insights from this analysis, three actions clearly stand out to -improve CI usage across all projects: -* The education team will have to increase their portfolio and offer more - courses centered around CI usage. -* The popularity of _GitLab CI_ will likely increase the demand to migrate - other projects to this system. It will fall to the consulting branch to be - prepared to deal with such requests. -* The technology team has already begun to offer pre-made recipes for CI - pipelines and has an incentive to grow the collection of ready-to-use solutions - for popular scenarios. - -## Conclusion - -Thanks to the participants of the HIFIS survey in 2020 it was possible to gain -a first glimpse into the status quo of research software engineering within the -Helmholtz centers. With this data, the needs of the scientists could be assessed -from a birds-eye perspective and it is possible to determine concrete steps to -offer better support for the scientists at Helmholtz. - - -- GitLab