CONFIDENTIAL Designator

Project Lifecycle & Dependency Management
Open Source Development Course
Marek Čermák
OpenSourceDevelopmentCourse2019

Project release lifecycle

Releases indicate the development status and record the development progress.

Versioning and releases are among the most important parts of developing software: they let users and contributors navigate the project lifecycle and development status, and they let developers manage the project.

PROJECT LIFECYCLE

Different people have different needs

“As a contributor to a project that is new to me, I always struggle to navigate the contribution guidelines. Should I, or can I, even contribute? How do I contribute: is there a specific procedure? When is the right time?”
A contributor

“When choosing a project I want to depend on, I usually look at its development status and decide based on the maturity of the project. Concise documentation is a plus.”
A developer

“A project becomes harder to maintain as it grows. The key to success is having well-prepared continuous delivery and integration pipelines. As a benefit, they protect you and your team from absolute chaos.”
A developer & maintainer

What is a “project”?

The following definition applies to a Python project, but can be easily translated to other languages:

"Projects" are software components that are made available for integration. Projects include Python libraries, frameworks, scripts, plugins, applications, collections of data or other resources, and various combinations thereof. Public Python projects are typically registered on the Python Package Index.

Source: https://www.python.org/dev/peps/pep-0440/
Semantic versioning

The version scheme is used both to describe the distribution version provided by a particular distribution archive, as well as to place constraints on the version of dependencies needed in order to build or run the software. The canonical public version identifiers MUST comply with the following scheme:

[Figure: version identifier with its parts labelled major, minor, pre-release, post release, development release]

Source: https://www.python.org/dev/peps/pep-0440/

Development status (Software release lifecycle)

Planning: Ideation phase and defining objectives.
Pre-Alpha (0.X.Y.devN): Often used in projects that release “early and often”; not meant for public consumption.
Alpha / Beta (0.X.YaN, 0.X.YbN): Pre-releases that support software testing among a limited set of users.
Release Candidate (0.X.YrcN): Candidate for a stable (final) release, meant for early adopters.
Production/Stable (0.X.Y): Stable release used for public consumption. Also called a “final release”.
Mature (X.Y): Usually more than one release, or at least one major release.

Planning

Define objectives and the target audience: You have to know WHAT the project is for and WHO the project is for. The objectives should be well defined and understandable.
Do the research: Do NOT reinvent the wheel! If a similar project already exists, see if you can use it or contribute to it.
Determine the tools and delivery strategy [if not defined]: These can be publication tools, installation tools and other automation tools used for development and delivery.
Come up with a POC: Prove the concept. This serves as a feasibility study as well.

PRE-ALPHA

Pre-alpha refers to all activities performed during the software project before formal testing.

These activities can include requirements analysis, software design, software development, and unit testing.
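Before going through the individual phases, the version suffixes from the development-status overview above can be turned into a small check. The following is a minimal sketch, not an official tool: the regular expression follows the canonical public version form described in PEP 440, while the function name `development_status` and the status labels are assumptions chosen to mirror the overview. "Mature" is left out because it cannot be derived from a single version string.

```python
import re

# Canonical public version form from PEP 440:
#   [N!]N(.N)*[{a|b|rc}N][.postN][.devN]
CANONICAL = re.compile(
    r"^(?:(?P<epoch>[1-9][0-9]*)!)?"
    r"(?P<release>(?:0|[1-9][0-9]*)(?:\.(?:0|[1-9][0-9]*))*)"
    r"(?:(?P<pre>a|b|rc)(?P<pre_n>0|[1-9][0-9]*))?"
    r"(?:\.post(?P<post>0|[1-9][0-9]*))?"
    r"(?:\.dev(?P<dev>0|[1-9][0-9]*))?$"
)

# Labels follow the development-status overview (illustrative mapping).
PRE_RELEASE_STATUS = {"a": "Alpha", "b": "Beta", "rc": "Release Candidate"}


def development_status(version: str) -> str:
    """Map a canonical PEP 440 version string to a development status."""
    match = CANONICAL.match(version)
    if match is None:
        raise ValueError(f"not a canonical PEP 440 version: {version!r}")
    # Simplification: any .devN suffix is reported as Pre-Alpha, even when
    # it is attached to a pre-release such as 1.0a1.dev2.
    if match["dev"] is not None:
        return "Pre-Alpha"
    if match["pre"] is not None:
        return PRE_RELEASE_STATUS[match["pre"]]
    return "Production/Stable"
```

For instance, `development_status("0.3.1rc1")` yields "Release Candidate", and a string such as "0.3.1-beta" is rejected because it is not in canonical form. For real projects, the third-party `packaging` library offers a complete PEP 440 parser; the sketch above only illustrates the idea.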
Software design: From a concept, through the architecture, to implementation details.
Software development: Includes programming, feature implementation, feature enhancements, bug fixes and maintenance (e.g. updates and migrations).
Unit & integration testing: Check whether the individual units of source code function as expected based on a set of determined rules, and whether they fit and function together.

Source: https://en.wikipedia.org/wiki/Software_release_life_cycle

ALPHA / BETA / RC

Pre-releases refer to a set of version identifiers which denote preparation for the final release and are meant for early adopters.

The pre-releases include the alpha and beta releases and release candidates.

Alpha: The first phase of software testing before releasing it to customers / users. In proprietary software, it is not common for a package in alpha release to be generally available. Alpha usually ends with a feature freeze.
Beta: The software is expected to have bugs which do not directly affect its functionality. The main purpose is to reduce impact on customers / users or to demonstrate and preview a product. Commercial beta software is usually available either to a limited set of users outside of the organization (closed beta) or publicly (open beta).
Release Candidates: A beta version with the potential to become the final product, ready to be released. Minor fixes for specific defects are expected, but NO new features or API changes should be made.

Source: https://en.wikipedia.org/wiki/Software_release_life_cycle

RELEASE

Release (stable, or final release) indicates that the software is stable, tested and ready to be used. Good to go.

General Availability (GA): Used mostly for commercial products, but occasionally seen in the Open Source world as well. GA means that the software is available for purchase.
Support: A release should be supported for a certain period of time, and further releases should guarantee a certain degree of backwards compatibility (this is not a rule, but it is greatly appreciated).

Source: https://en.wikipedia.org/wiki/Software_release_life_cycle

So, when to contribute and when to file an issue? When should I NOT use the project yet?

If a project is well-maintained, it is easy to spot the development status at first glance.

Source: https://github.com/kubernetes/kubernetes

CONTRIBUTION: GUIDELINES

Contribution Guidelines

Guidelines communicate how people should contribute to your project.

Contribution guidelines are a set of recommended, or sometimes even required, practices established by a maintainer for contributors to follow.

Before contributing to an open source project, make sure to check its contribution guidelines!

Usually, they can be found in a file called CONTRIBUTING.md[0] or, in the case of GitHub, they might be integrated into PRs and Issues directly[1].

Verification for both contributors and developers: The guidelines help both contributors and developers verify that they're submitting well-formed pull requests and opening useful issues.
Getting started for contributors: Contributors might struggle to navigate the project, or they may not know where to start contributing or what a PR or issue should look like.
Prevent confusion and save time: For both owners and contributors, contribution guidelines save the time and hassle caused by improperly created pull requests or issues that have to be rejected and re-submitted.

Source:
[0]: https://github.com/atom/atom/blob/master/CONTRIBUTING.md
[1]: https://help.github.com/en/articles/setting-guidelines-for-repository-contributors

Source: https://help.github.com/assets/images/help/pull_requests/contributing-guidelines.png

As for WHAT to contribute … It doesn’t have to be code

“Most people don’t know that I actually don’t do any real work on the CocoaPods tool itself. My time on the project is mostly spent doing things like documentation and working on branding.”
@orta

“I first reached out to the Python development team (aka python-dev) when I emailed the mailing list on June 17, 2002 about accepting my patch. I quickly caught the open source bug, and decided to start curating email digests for the group. They gave me a great excuse to ask for clarifications about a topic, but more critically I was able to notice when someone pointed out something that needed fixing.”
@brettcannon

Source: https://opensource.guide/how-to-contribute/

CONTRIBUTION: BEST PRACTICES

“Okay, I read the contribution guidelines. What now? How do I proceed? What are the best practices?”
Ready-to-contribute pessimist Jerry

FORK the repository: A fork is a copy of a repository. Forking a repository allows you to freely experiment with changes without affecting the original project.
Clone the repository to your local machine: `git clone`
Create a branch for the specific purpose: `git checkout -b fix-readme-typo`
Iterate on the issue: Not all fixes are so simple that they can fit into one commit.
Commit your changes: `git commit -a --signoff`
Pull Request: `git push`. Push to the remote and you’re ready to create a PR against the upstream.

It all starts with a FORK ...

Can you spot an ISSUE?

Potentially lots of commits!

Let’s squash it!

Squash unnecessary commits, like minor fixes and typos, and push them to the remote.
Pull Request: `git rebase -i `, then `git push`

Can you spot an ISSUE?

Pull, squash and push … sounds weird, but does wonders!

Rebase onto the upstream branch to make sure there are no merge conflicts, squash unnecessary commits and push to the remote.

Pull Request: `git pull upstream master --rebase`, then `git rebase -i `, then `git push`

There are other useful practices to follow when contributing to the upstream

Document WHAT and WHY: Documenting why the changes were made and what led to your decisions will save the reviewer’s time and increase the chances of your PR being merged. Use provided doc generators, if possible.
Follow the code style: When contributing code, it is good practice to adapt to the project's code style (especially for languages with fluid code styles, like JS). Use provided formatters, if possible.
Run tests before submitting the PR, and write new ones when introducing new features.
Be extremely careful when changing project dependencies (see further).

PROJECT DEPENDENCIES

Software dependencies

Dependencies are hell for maintainers, a blessing for developers and heaven for attackers.

A dependency is additional code that you want to call from your program. Adding a dependency avoids repeating work already done: designing, writing, testing, debugging, and maintaining a specific unit of code.

Source: https://research.swtch.com/deps

Kinds of dependencies

Direct dependencies: Libraries that your code depends upon. These require some effort to control, but compared to the others they are fairly manageable.
Transitive dependencies: Dependencies of the dependencies. Usually quite hard to control.
Third party dependencies: A special kind. These are the dependencies that you don’t own and that are not part of your organization. Especially hard to control.

Transitive dependencies

Source: https://cermakm.shinyapps.io/Tensorflow_Transitive_Dependencies/

What could go wrong?

A package is code you download from the internet. Adding a package as a dependency outsources the work of developing that code (designing, writing, testing, debugging, and maintaining) to someone else on the internet, someone you often don’t know. By using that code, you are exposing your own program to all the failures and flaws in the dependency. Your program’s execution now literally depends on code downloaded from this stranger on the internet.

Source: https://research.swtch.com/deps

What could go wrong? You name it ...
Security vulnerability (CVE)
Version conflict
API changes
License conflict
Missing/removed dependency
Broken third-party dependency

It sounds unsafe ...

And it is… but it is also necessary to keep the wheel of Open Source spinning!

A note about the security vulnerabilities

What is a "Vulnerability?" An information security "vulnerability" is a mistake in software that can be directly used by a hacker to gain access to a system or network.

What is an "Exposure?" An information security exposure is a mistake in software that allows access to information or capabilities that can be used by a hacker as a stepping-stone into a system or network.

What is CVE? CVE is a list of information security vulnerabilities and exposures that aims to provide common names for publicly known problems. The goal of CVE is to make it easier to share data across separate vulnerability capabilities (tools, repositories, and services) with this "common enumeration."

Please visit http://cve.mitre.org/about/faqs.html for more information.

Source: https://www.cvedetails.com/cve-help.php

Common vulnerabilities according to the NVD

Source: https://www.cvedetails.com/vulnerabilities-by-types.php

Beware the “dependency hell”

Especially when working with complex systems that have a lot of dependencies, it might be incredibly difficult to find the “right” combination of versions that are actually compatible with each other. Sometimes we might even reach a sort of “deadlock” state if one dependency requires a version of another which is in fact not compatible with the rest of the project, e.g.:

A requires Da && Da requires X == 1.13
A requires Db && Db requires X == 1.13.5

Will it break, or not?

It might become tedious ...
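Checking such constraints by hand quickly gets tedious, so here is a minimal sketch that answers the "Will it break, or not?" question for the example above. Everything in it is illustrative: the `GRAPH` dictionary is a hypothetical dependency graph mirroring A, Da, Db and X, and the helper names `normalize`, `collect_pins` and `find_conflicts` are assumptions, not part of any real resolver. It only understands exact `==` pins and, in the spirit of PEP 440 exact matching, treats 1.13 and 1.13.0 as the same release; real resolvers also handle ranges, markers and backtracking.

```python
from typing import Dict, List, Optional, Tuple

Pin = Tuple[int, ...]

# Hypothetical graph: package -> list of (dependency, exact pin or None).
GRAPH: Dict[str, List[Tuple[str, Optional[str]]]] = {
    "A": [("Da", None), ("Db", None)],
    "Da": [("X", "1.13")],
    "Db": [("X", "1.13.5")],
    "X": [],
}


def normalize(pin: str) -> Pin:
    """Turn '1.13' into (1, 13); trailing zeros are dropped so 1.13 == 1.13.0."""
    parts = [int(part) for part in pin.split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def collect_pins(root: str, graph) -> Dict[str, Dict[Pin, List[str]]]:
    """Walk direct and transitive dependencies of `root`.

    Returns {dependency: {normalized pin: [packages requiring that pin]}}.
    """
    pins: Dict[str, Dict[Pin, List[str]]] = {}
    seen = set()
    stack = [root]
    while stack:
        package = stack.pop()
        if package in seen:
            continue
        seen.add(package)
        for dependency, pin in graph.get(package, []):
            if pin is not None:
                by_pin = pins.setdefault(dependency, {})
                by_pin.setdefault(normalize(pin), []).append(package)
            stack.append(dependency)
    return pins


def find_conflicts(root: str, graph) -> Dict[str, Dict[Pin, List[str]]]:
    """Keep only dependencies pinned to more than one distinct version."""
    return {
        name: by_pin
        for name, by_pin in collect_pins(root, graph).items()
        if len(by_pin) > 1
    }
```

Calling `find_conflicts("A", GRAPH)` reports X as pinned to two different releases, one by Da and one by Db, so under exact matching no single version of X can satisfy both and the installation breaks. Had Db required X == 1.13.0 instead, the two pins would normalize to the same release and no conflict would be reported.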
Source: https://cermakm.shinyapps.io/Tensorflow_Transitive_Dependencies/

Can you guess the issue?

HINT: centrality, weights

A dependency can actually have a greater centrality and thus be “more important”!

Source: https://cermakm.shinyapps.io/Tensorflow_Transitive_Dependencies/

So… what can we do?

Source: https://miro.medium.com/max/3000/1*41XiwBL9NXDfGtIXbc3UsQ.jpeg

Good practices when managing dependencies

Consider the value of adding the dependency: If introducing the dependency only spares you a few lines of code, do NOT introduce it at all. It is not worth it.
Choose a compatible and secure version: Take the time and investigate. Choose a version which is CVE free and is compatible with the rest of the application.
Keep your dependencies up to date: Update the dependencies and keep the code you own up to date with them. Do not rely on a pinned-down version.
Regularly watch for CVEs and consult the NVD: Do NOT expose your application. GitHub and specialized software exist to inform you about potential security risks of your application.
Consider the impact of the dependency: Consider how important the dependency is to your application and treat the dependency accordingly.
Unit TESTS & integration TESTS!: Write unit tests and integration tests, especially for functions using code that you don’t own!

And don’t ever forget ...

TO MAKE SURE THAT THE LICENSES ARE COMPATIBLE!

The compatibility is sometimes tricky ...
Source: https://www.slideshare.net/SamsungOSG/guide-to-open-source-compliance

CI/CD

Continuous { Integration, Delivery, Deployment }

Introduction to CI/CD

CI/CD are acronyms that are often mentioned when people talk about modern development practices.[0]

CI/CD is a set of practices which have a significant impact on the way new releases are delivered and maintained.

These are the three main practices to be familiar with.

Continuous Integration: Change validation by creating a build and running automated tests against the build. By doing so, you avoid the integration hell that usually happens when people wait for release day to merge their changes into the release branch.
Continuous Delivery: An extension of continuous integration to make sure that you can release new changes to your customers quickly in a sustainable way. This means that on top of having automated your testing, you have also automated your release process.
Continuous Deployment: Continuous deployment goes one step further than continuous delivery. With this practice, every change that passes all stages of your production pipeline is released to your customers. There's no human intervention, and only a failed test will prevent a new change from being deployed to production.

Source: [0] https://www.atlassian.com/continuous-delivery/principles/continuous-integration-vs-delivery-vs-deployment

Q&A

THANK YOU

FINALLY OVER..

https://github.com/CermakM
https://www.linkedin.com/in/ai-mcermak/
https://twitter.com/Marc_Cermak
https://www.facebook.com/macermak