by LINE Engineer on 2016.11.14
In this blog post, I’d like to explain how the LINE TODAY service was developed using the Agile development method. LINE TODAY is a mobile news service that was released in Taiwan, Thailand, Indonesia, Myanmar, and the United States in early 2016. As of July 30, 2016, the service recorded nearly 30M daily PV (page views). In Japan, a service similar to LINE TODAY is available under the name of LINE NEWS.
The LINE TODAY development project is a transnational project with users, customers, developers and planners from various countries. Many members participated in the project, including developers, planners, QA engineers, UIT (User Interface Technology) engineers, designers, and business owners located across Taiwan, Dalian, and Korea. Moreover, it was a newly created team starting everything from scratch. Most of the developers were rookies unfamiliar with the LINE development environments and skills. No one had experience handling a new development project, and no one knew the overall process of building a global product.
The LINE TODAY service was planned from the beginning to be released in several countries such as Taiwan, Thailand, and Indonesia. Various requirements had to be taken into account. For example, we had to consider that the content providers (CP) might prefer different types of content feeding mechanisms; Taiwan CPs preferred FTP whereas Thailand and Indonesia CPs preferred RSS.
LINE TODAY was a project with very tight development schedules. The service was released twice: first as a FastTrack version and then as a RegularTrack version. The FastTrack version was a proof of concept for evaluating business potential, which then served as the basis for the RegularTrack version aimed at providing the service in the long term. The FastTrack version had to be designed, developed, tested and released in 6 weeks, and the RegularTrack version within 3 months after the FastTrack version was released.
The only solution is to be “Agile”
Scrum is known as a good solution for complicated projects. Hence, we decided to introduce Scrum practices into the LINE TODAY development project and practiced the four principles seen below.
- Team spirit
- Self-management and self-development
- Methodologies of communication
- Methodologies of development
Now let me introduce how we applied the Agile development processes and Scrum techniques to this project. (Scrum terms are highlighted with an underscore. For example, a daily stand-up meeting.)
A true Agile team must have team spirit. The key to the success of a project is the team spirit we build in our daily Scrum practices.
Trust and commitment
It is the duty of a Scrum Master to build trust among developers, between developers and planners, and between planners and business owners. The Scrum Master let developers participate in estimating project schedules and making execution plans. No decision was made entirely by the management team. All developers tried their best to meet the schedule they had committed to. Since developers always fulfilled their commitments on time, more trust built up. Planners could trust developers and stakeholders could trust planners.
Innovation and creativity
We encouraged developers to be innovative and to generate creative ideas. We reviewed the ideas suggested by developers through pros/cons analysis. This was done in a planning meeting while discussing execution plans and performing a task breakdown. For example, every developer analyzed the pros and cons and freely gave their opinions on selecting a database framework or data schema.
Why, What, How, and When
First, in planning meetings or backlog grooming sessions, planners explained to the developers “what they need to do” and “why they need to do it.” It was all about convincing developers why a certain user story is important. Second, developers explained the “how” and “when” to the planners. Developers explained how much time they would need to implement the user story (complexity estimation) so that planners could adjust priorities based on cost efficiency analysis. Third, developers chose which user stories they would commit to in the Sprint based on the priority agreed between planners and developers, and discussed how they would implement them (task breakdown).
Self-management and self-development
With good team spirit, the team leader was able to easily create a self-managed and self-developing team with the standard Scrum practices.
A planning meeting should be segmented into two parts: planning meeting 1 and planning meeting 2. In planning meeting 1, planners explain user stories and epics in a product backlog so that developers can discuss and estimate the complexity. In planning meeting 2, developers choose user stories to commit to in the Sprint and break down tasks.
The LINE TODAY Scrum team did not strictly separate planning meetings into the two mentioned above. Instead, the team only maintained the key concept: always let developers see and understand the user stories in advance while planners are still preparing them. Another important practice in a planning meeting was never asking developers to commit to a user story that they had just seen for the first time. If needed, we arranged a backlog grooming meeting for planners to explain big user stories or epics in detail. Planners would also explain small user stories in daily stand-up meetings as long as the meetings didn’t exceed 30 minutes.
The LINE TODAY Scrum team incorporated the “King and Servant” model. Any developer could commit to a user story in the Sprint and act as a king for the user story, taking full responsibility for completing it, including on-time delivery, quality, and removing blockers. Other developers would act as servants to support the king in completing the user story.
Daily stand-up meeting
A daily stand-up meeting is where everybody provides brief updates on their progress. Meetings usually last from 15 to 30 minutes.
Team leaders or Scrum Masters have very important responsibilities in daily stand-up meetings. The team leader or Scrum Master must not only ask people to provide updates on their progress but also help the team build a working model and establish connections. This is especially important for a new team. A new team must find its own way to work together and mature. A team leader should not act as a commander who gives top-down orders. A team leader should make suggestions and coach the team to find its own working model. About two months into the project, the LINE TODAY Scrum team had matured enough to self-organize and demonstrate development teamwork.
It’s important for planners and other stakeholders to understand what has been done. So we had live demos at the end of each Sprint. Generally, a live demo of a user story is from the perspective of users. On the other hand, our developers demonstrated a user story from the perspective of engineering tasks such as API or test automation. A developer, acting as a king, would demonstrate achievements for the user story he or she is responsible for by explaining software design, running unit tests and smoke tests, and presenting performance metrics.
This practice is important for helping developers improve their skills and qualifications. In addition, planners and other stakeholders can get a rough picture of why developers need more time than expected. This can help build trust between planners and developers.
Retrospectives are a very important practice for self-managed and self-developing teams. Team members can find out important items to improve from a retrospective. It’s not about fixing everything. It is more about making routine and gradual self-improvements.
During the first few Sprints, we focused on improving co-development processes such as coding conventions, code branching and merging, CI practices, and so on. These were critical for the LINE TODAY Scrum team in building a mature working model for system development.
We had another interesting practice. Each member would stand up in front of everybody and say what went well, what could be improved, and who to give thanks to. At the end, the team leader or Scrum Master coached the team members to find the top 1 or 2 items to improve for the next Sprint. The Scrum Master would remind members of those items during daily stand-up meetings and review results at the next retrospective.
Methodologies of communication
Inner and outer communications
To communicate efficiently, we set up two communication channels: inner communication and outer communication.
For inner communication, we let the Taiwan office planners consolidate requirements from the Korea office planners, acting as a single source of information for product specifications. For outer communication, Taiwan office planners communicated with Korea office planners, project managers, and other stakeholders including the business teams from each country.
Product backlog, epic, user story, task
To have all team members be on the same page in terms of development tasks and progress, we adopted the JIRA Scrum Board. The LINE TODAY Scrum team created a special epic called “Product Features” and managed epics in two tiers. This way, everyone could have a full picture quickly whether from a planner’s perspective, project manager’s perspective, or from a developer’s perspective. Also, to help planners and management teams easily understand progress by looking at the entire backlog, we set dependencies for user stories to other related user stories or tasks in “Product Features.” With these practices, developers, planners, and other stakeholders could easily read product backlogs.
To make a user story a single source of information, we adhered to the rules below:
- Planners must provide detailed specifications or a link to the specifications in a user story ticket so that anyone can find the specifications easily and quickly.
- Developers must link code commit logs in a task ticket of a user story so that other developers or QA engineers can understand what they have done.
- Any changes to specifications must be escalated to a user story ticket so that their change histories can be traced.
You may have heard stories where planners change specifications unexpectedly, much to the chagrin of developers. The practices above helped reduce miscommunication, and made the age-old conflict between planners and developers a thing of the past.
Tools for daily communication
We used LINE, HipChat, and video conferences for daily communication, but no matter which tool we used, the most effective way to communicate was talking to each other in person. We encouraged developers to talk to planners and stakeholders directly if they needed answers to unblock the development task at hand.
Methodologies of development
In addition to team spirit, another critical factor that makes a true Agile team is the methodologies the team uses for development.
Reuse existing resources
LINE is a global company well-equipped with global platforms and sharing solutions. We didn’t hesitate to adopt existing resources like Timeline Content API, OBS and CDNs distributed across several regions. Reusing existing platforms or tools can reduce development efforts and help us meet the time-to-market requirements.
Adopting OBS and CDN was the key factor that enabled us to release the LINE TODAY service on time, with the capability to serve five countries with a single original server farm while maintaining high performance. The “Dynamic Site Accelerator” of CDN helped us run the dynamic content module efficiently by processing user comments with the server-side API that provides a cache mechanism.
CI (Continuous Integration) and CD (Continuous Delivery)
Nowadays, CI and CD are very popular software engineering practices adopted by many IT companies. I’m not going to explain what CI and CD are here, but let me share what we have done up until now and how useful they have been to our project.
In a pre-commit/commit build, unit tests and coding conventions are very important for keeping high development quality. Building a powerful unit test framework is essential especially for collaborative development.
Code review is not only a means to maintain good quality but also a method for team-building and human resource development. The LINE TODAY project team built its own Git flow and PR review practices based on the “King and Servant” model. Developers who were familiar with a specific area would act as a king to coach other developers who weren’t, so that all developers would acquire the necessary knowledge. Developers then put the lessons learned into practice in code reviews.
The LINE TODAY project team had a Taiwanese QA engineer dedicated to test automation. There was also a QA task force in Dalian dedicated to manual testing. The goal was to automate testing as much as possible. Test automation is not cheap but the high cost was certainly worthwhile in the end.
Our team started to gradually build the test automation system in the RegularTrack version. QA engineers visually presented their test automation system in a live demo and explained what the benefits were and what risks could be prevented. During the retrospective, developers also provided feedback and explained how the test automation system helped them.
Extensibility and scalability
We always encouraged developers to discuss system design in terms of extensibility and scalability.
More and more companies are adopting a cloud solution in system design for extensibility and scalability. Following this trend, we also use cloud-based solutions such as MongoDB, Crawler4j, GlusterFS, Redis, RabbitMQ, and Elasticsearch. It takes a lot of time to research and learn new technologies for system design, but we encourage all developers to do so. Developers run sizing and estimate the performance indicator numbers. Then QA engineers or developers build a monitoring environment to track the numbers in beta and production sites. A meeting is held to share and discuss the sizing results and review how and why the design meets the extensibility and scalability requirements.
LINE TODAY – Popular Articles feature
From now on, I will introduce one of the main features of the LINE TODAY service – Popular Articles.
The main page of LINE TODAY is designed for users to browse articles by category. On the main page, we not only provide articles published by local news editors, but we also added smart modules that can show a quick view of popular articles among all users.
There were several technical challenges in making this feature work.
- Real-time (responsiveness): The list of popular articles must be updated dynamically in response to user actions such as reading an article or adding a comment.
- Flexibility: The overall trending mechanism such as fetching raw data, processing user actions, and composing article content must be generic. It should be flexible so that we can change the logic of calculating popular articles and update the module without requiring too much effort.
- Scalability and robustness: Our service components need to process lots of user actions to generate the list of popular articles. Here, we are talking about 30 million daily page views. The system must be strong enough to deal with such a huge amount of traffic and should be highly extensible for scaling up.
Here’s how we designed the overall architecture.
The Web API server is the main component that collects user actions. To collect data in real-time and reduce the overhead on the Web API server, data is stored in temporary storage queues.
RabbitMQ provides high throughput and different levels of data persistence, which made it a good candidate for our service.
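The idea of buffering user actions in a queue so that the Web API server can return quickly can be sketched as follows. This is only an in-process illustration: Python's standard `queue.Queue` stands in for a RabbitMQ queue, and the function names and the message fields (`user_id`, `article_id`, `action`) are assumptions made for this example, not the actual LINE TODAY schema.

```python
import json
import queue


def record_action(q, user_id, article_id, action):
    """Called by the API handler: enqueue the action and return immediately.

    The handler does no aggregation itself, which keeps its overhead low.
    The message fields are hypothetical.
    """
    message = {"user_id": user_id, "article_id": article_id, "action": action}
    q.put(json.dumps(message))


def drain(q, max_batch):
    """Called by the consumer side: pull up to max_batch messages in FIFO order."""
    batch = []
    while len(batch) < max_batch:
        try:
            batch.append(json.loads(q.get_nowait()))
        except queue.Empty:
            break  # nothing left to consume right now
    return batch


# Stand-in for a RabbitMQ queue (assumption: a real deployment would use a broker).
action_queue = queue.Queue()
record_action(action_queue, "u1", "a1", "read")
record_action(action_queue, "u2", "a1", "comment")
record_action(action_queue, "u1", "a2", "read")
first_batch = drain(action_queue, max_batch=2)
```

With a real broker, the producer and consumer would run in separate processes, and persistence settings would decide whether queued actions survive a restart.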
The Trending Server is the component that processes user action logs and generates the list of popular articles in real-time based on the collected logs.
To keep the logic of selecting popular articles flexible, we keep user action logs as raw data instead of pre-aggregating them with pre-set rules. We rely on a search engine to run queries and generate the list of popular articles. Elasticsearch is a powerful framework for real-time data analytics. We use filter queries combined with common bucket aggregations to meet the basic needs of generating the list of popular articles. We have also extended the querying capabilities to handle more complicated cases by using scripted aggregations. Beyond efficient and flexible querying, Elasticsearch also offers the Bulk API, which is extremely useful for increasing indexing speed when user data is inserted.
Once all the trending-related components are ready, the Trending Server runs query commands at very short intervals in the background and caches the output in Redis. The Web Server, a module for composing web pages, then fetches the most up-to-date list of popular articles. To serve news articles to users all over the world with the shortest possible loading times, the articles are cached in a CDN, and browsers fetch the cached content from the closest edge server.
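To make the filter query plus bucket aggregation more concrete, here is a minimal sketch of what such a request body might look like, written as a Python dictionary. The field names (`timestamp`, `article_id`), the aggregation name `popular_articles`, and the one-hour window are assumptions for illustration; the actual index mapping and ranking rules of LINE TODAY are not described in this post.

```python
def build_popular_query(window="now-1h", size=10):
    """Build an Elasticsearch search body that counts raw action logs per article.

    Field names and the default window are hypothetical.
    """
    return {
        "size": 0,  # only the aggregation result is needed, not the matching logs
        "query": {
            "bool": {
                "filter": [
                    # keep only actions inside the time window
                    {"range": {"timestamp": {"gte": window}}},
                ]
            }
        },
        "aggs": {
            "popular_articles": {
                # terms aggregation: one bucket per article, largest buckets first
                "terms": {"field": "article_id", "size": size}
            }
        },
    }


def parse_popular(response):
    """Turn the aggregation part of a search response into (article_id, count) pairs."""
    buckets = response["aggregations"]["popular_articles"]["buckets"]
    return [(bucket["key"], bucket["doc_count"]) for bucket in buckets]
```

Because the logs stay raw, changing the definition of "popular" (for example, a different time window or an extra filter on action type) only means changing this query body, not re-processing stored data.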
As you can see, the list of popular articles is generated by various components that work together to process a series of tasks. However, when it comes to the testing phase, things become more tricky. There are simply too many user action logs to process. And when user action sequences are added, the combination becomes highly dynamic, making it almost impossible to test and verify features manually.
First, each service component is verified through different levels of unit tests. Then the QA team simulates user actions using an automated integration test script and validates the popular article list generated by the Trending Server against various combinations of user actions. The test script runs in Jenkins, which sends an alert email if the output differs from what is expected.
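The core of such an integration check can be sketched as a comparison between the list returned by the system under test and a reference ranking computed directly from the simulated actions. The ranking rule used here (count actions per article, break ties by article ID) is an assumption chosen to keep the example deterministic; it is not the actual LINE TODAY ranking logic, and the function names are hypothetical.

```python
from collections import Counter


def expected_popular(actions, size):
    """Reference ranking: most actions first, ties broken by article id (assumed rule)."""
    counts = Counter(action["article_id"] for action in actions)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [article_id for article_id, _ in ranked[:size]]


def check_popular(actions, actual, size):
    """Return a list of mismatch messages; an empty list means the check passed."""
    expected = expected_popular(actions, size)
    if actual == expected:
        return []
    return [f"expected {expected}, got {actual}"]


# A short simulated action sequence (hypothetical data).
simulated = [
    {"article_id": "a2", "action": "read"},
    {"article_id": "a1", "action": "read"},
    {"article_id": "a1", "action": "comment"},
    {"article_id": "a3", "action": "read"},
    {"article_id": "a2", "action": "read"},
    {"article_id": "a1", "action": "read"},
]
mismatches = check_popular(simulated, actual=["a1", "a2"], size=2)
```

In a CI job, a non-empty mismatch list would typically be turned into a failing exit status so that the build is marked as failed and a notification goes out.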
It’s also important to build a monitoring mechanism to make sure that all service components are working properly before releasing to a production site. LINE uses a variety of proprietary tools including IMON and nSight. These tools provide monitoring and alerting functionality from the system level to the application level. RabbitMQ and Elasticsearch also provide a large number of service statistics on their web monitoring interfaces, which is convenient when checking statuses on demand. However, a more proactive approach is required to prevent things from going wrong. We wrote a health check script that sends alerts when the statistics of a service component exceed a pre-configured threshold. The script is integrated with the RabbitMQ management API and the Elasticsearch cluster/stats/status API, so we can easily detect abnormal situations such as growing queue sizes or a cluster member stopping.
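The threshold logic of such a health check might look like the sketch below. It assumes the statistics have already been fetched as JSON: a list of queue objects as returned by the RabbitMQ management API (`/api/queues`, with `name` and `messages` fields) and the body of the Elasticsearch cluster health endpoint (`/_cluster/health`, with `status` and `number_of_nodes`). The function name, the thresholds, and the alert wording are hypothetical, and the HTTP calls and alert delivery are left out.

```python
def find_alerts(queues, cluster_health, max_queue_depth, expected_nodes):
    """Compare fetched statistics against thresholds and return alert messages."""
    alerts = []

    # A growing queue suggests the consumer cannot keep up with incoming actions.
    for q in queues:
        depth = q.get("messages", 0)
        if depth > max_queue_depth:
            alerts.append(
                f"queue {q['name']} has {depth} messages (limit {max_queue_depth})"
            )

    # Anything other than "green" means some shards are unassigned.
    status = cluster_health.get("status")
    if status != "green":
        alerts.append(f"cluster status is {status}")

    # Fewer nodes than expected means a cluster member has stopped.
    nodes = cluster_health.get("number_of_nodes", 0)
    if nodes < expected_nodes:
        alerts.append(f"cluster has {nodes} of {expected_nodes} nodes")

    return alerts


# Hypothetical sample data for illustration only.
sample_queues = [
    {"name": "user_actions", "messages": 12000},
    {"name": "comments", "messages": 3},
]
sample_health = {"status": "yellow", "number_of_nodes": 2}
sample_alerts = find_alerts(
    sample_queues, sample_health, max_queue_depth=10000, expected_nodes=3
)
```

Keeping the comparison separate from the fetching code makes the thresholds easy to unit test without a running broker or cluster.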
So far, this post has explained how the LINE TODAY project team built a true Agile team through team spirit, processes, flows, tools, and methodologies. We demonstrated Agile practices and proved the benefits of Agile development. I would like to emphasize again that team spirit is the most important factor and that it’s the starting point that can lead a team in the same direction. All developers in the team did their best to find solutions by working with planners. This was possible because we had good software engineering practices in place. Those practices helped us achieve a high level of quality while also releasing the product quickly.
About the authors
Marco Chen: I’m a lead developer and Scrum Master for the LINE TODAY project. I’m responsible for building Scrum teams and developing processes and collaboration models.
Yang Ya-Chu: I’m a back end engineer working at LINE Taiwan to design and implement back end systems for LINE services. Currently I’m in charge of designing the “Trending Service System” of LINE TODAY.