1
How to tidy up - Mark Barnes
IT Janitor: How to tidy up. My name is Mark Barnes and I am the integration team manager at the Financial Times. The integration team is the closest thing we have to full-stack engineers. Over the past year I have led a project to try to divest ourselves of some of our old kit.
2
The internet and mobile devices have changed the newspaper and media market forever, and things are changing faster and faster. The move from print to digital has revolutionised the industry.
3
we have gone from this (old printing press)
4
to this: public clouds in vast data centres
... and the pace of change is only getting faster
5
and at the same time there has been a clear switch in the way people read FT content
6
Mobile devices have increased that pace of change, as people consume content in ways that are convenient to them. In a typical week the desktop readership disappears at the weekends, and the newspaper, while still important, is less so than digital content. All this means that we need to be able to respond to the market much, much faster, so old technologies have to be dealt with.
7
Not only has the way we deliver our content changed, but the way we pay for delivering it has changed too. Advertising revenue has dropped off a cliff. So that’s a little context: this rapidly changing environment is where the FT finds itself.
8
so the technology needs to keep pace as well
[Slide graphic: a timeline of operating system versions running back from "now". Labels as shown: 2.1, 3, 4, 5, 6, 7, 2.5, 2.6, 8, 9, 10, 11, 6.2, 2003, 2003 R2, 2008, 2008 R2, 2012, 2012 R2, NT3.5, NT4, 2000, 11.11, 11.22, 11.31, 4.3, 4.2]
I have been to Velocity before, not as a speaker but as a slightly awestruck attendee. While the speakers were describing the cool things they had made recently, I often found myself thinking: ‘Yeah, we have some good stuff at the FT, but we also have some old stuff. Are we the only ones with old stuff?’ At the FT we had some legacy kit, and I am going to take a gamble here: I think quite a few of us in the room have some old stuff too. So let’s find out how much we all have. If you can see an operating system up on the screen that you still have (or one even older), then please raise your hand. We will work backwards to see how long the hands stay up. Please keep your hand up (and I will raise mine) while your O/S is still visible.
9
A vital point here is that we have also built a new platform. This is important because without it we wouldn’t have had anywhere to put the services that still needed to exist but simply needed a new home.
10
What we’ll cover today
What was the problem?
How we got business buy-in to address the issue
How we decommissioned the old systems
The benefits
Questions
11
What was the problem? Teams at the FT are aligned to logical groups of products and services in a broadly DevOps style, so there are people that deal with the content management system, people that deal with our paywall, etc. However, some products, services and systems slipped through the gaps between teams, as everyone either looked to someone else to deal with them or looked at their shoes and hoped they wouldn’t get asked to fix the problem.
12
Because we had not been very good at tidying up, we had a plethora of old release systems and code repositories that were still in use. For some older systems this could make releasing even a tiny cosmetic change a laborious process lasting several weeks.
13
Babbage Difference Engine No. 2
The older, ageing systems were costing money and time and were increasingly unreliable. BUT turning off old systems is not glamorous or interesting to most people in technology or in the business, so half the battle was not technical. It was to do with hearts and minds.
14
needed to create a team dedicated to decommissioning the older kit
I needed to convince our stakeholders and business owners that it was a worthwhile endeavour, so I needed carrots and sticks to get people to buy in.
15
carrots included the prospect of faster release cycles
16
and cost savings. We did some digging to surface costs: hosting and support, as well as our own internal person-hours spent looking after the ageing systems. Incidentally, the older parts of FT.com were internally known as ‘classic’. Names are important because they frame our thinking, so it should have been decrepit, or derelict, or rickety FT.com. That would have made it easier to win an argument.
17
For sticks, I used fear.
18
especially useful as it has scary words like ‘infant mortality’ in it
"Bathtub curve" by Bathtub_curve.jpg: Wyatts; derivative work: McSush (talk). Licensed under public domain via Wikimedia Commons. A perfect graph for inspiring fear, stolen shamelessly from Wikipedia. For those unfamiliar with it: the red dotted line is the early failures of a new type of hardware while the kinks are ironed out, and the yellow dots are the hardware starting to wear out. When coupled with the constant random failures of a type of hardware, they pan out to be a vague bathtub shape.
19
there are no new security patches coming out for this old stuff you know...
Added together, that all pretty much won the argument. I put together a team and we got started.
20
How we did it. I am not going to lie to you: there was quite a lot of work involved.
21
There was a fair bit of archaeology....
There were systems people had been too scared to go near in years. In some of the dark corners of the data centres were systems that just chugged away, and no one had a completely clear view of what they did. A lot of work was needed to understand them: we spoke to people, we looked at the output of Wireshark and netstat, and we examined log files.
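As an illustration of the kind of digging described above, here is a minimal sketch that counts which remote hosts hold established connections to a box. It is not the FT’s tooling: the sample text is made up, and it assumes the six-column layout printed by Linux `netstat -ant` (other platforms, such as Solaris, format their output differently).

```python
from collections import Counter

# Made-up sample in the Linux `netstat -ant` column layout:
# Proto Recv-Q Send-Q Local-Address Foreign-Address State
SAMPLE_NETSTAT = """\
tcp        0      0 10.0.0.5:80      10.0.1.17:51234   ESTABLISHED
tcp        0      0 10.0.0.5:80      10.0.1.17:51240   ESTABLISHED
tcp        0      0 10.0.0.5:80      10.0.2.9:40112    TIME_WAIT
tcp        0      0 10.0.0.5:22      10.0.3.3:60022    ESTABLISHED
tcp        0      0 0.0.0.0:80       0.0.0.0:*         LISTEN
"""


def count_remote_hosts(netstat_text, state="ESTABLISHED"):
    """Count connections per remote host for TCP lines in the given state."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a six-column TCP line.
        if len(fields) != 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != state:
            continue
        # Foreign address looks like host:port; keep only the host part.
        remote_host = fields[4].rsplit(":", 1)[0]
        counts[remote_host] += 1
    return counts


remote_hosts = count_remote_hosts(SAMPLE_NETSTAT)
for host, connections in remote_hosts.most_common():
    print(f"{host}: {connections} established connection(s)")
```

A host that never shows up over a long enough sampling period is a candidate for "nobody uses this any more", though that still needs confirming with people and log files.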
22
we used splunk to understand the traffic
We used tools to analyse the data. Splunk is not cheap, but it was invaluable here. For those that don’t know Splunk, it is a tool that “captures, indexes and correlates real-time data in a searchable repository from which it can generate graphs, reports, alerts, dashboards and visualizations.” That’s from the Wikipedia page, which may have been written by Splunk’s marketing executives. Basically it’s a tool for analysing lots of machine-generated text data.
23
Once we knew more about a service/system we could work out what to do with it
There were a number of options for each system. We tried very hard to push back if one of the suggested options was ‘can’t we just keep it a bit longer?’, because that is how we got into this mess in the first place. Different services needed different solutions.
24
Finding owners and asking if they actually cared about the service anymore
There were services that the technology department at the FT had been looking after diligently for years that the business no longer really cared about, and that had been superseded by newer services and products. So a few things fell into a ‘low-hanging fruit’ category straight away.
25
Easy to kill: a small percentage we genuinely could just switch off. They just needed a 301 redirect here, or a ‘can you change that old T&Cs link to a newer version’ there, etc.
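For readers who have not set one up, a 301 tells clients that a URL has moved permanently and where to go instead. The sketch below shows the idea using Python’s standard `http.server`; the paths and `example.com` targets are hypothetical, and in practice a redirect like this would more likely live in web server or CDN configuration than in a dedicated process.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical mapping from retired paths to their replacements.
REDIRECTS = {
    "/old-terms": "https://example.com/terms",
    "/retired-service": "https://example.com/",
}


def redirect_target(path):
    """Return the new URL for a retired path, or None if there is none."""
    # Ignore any query string when looking up the path.
    return REDIRECTS.get(path.split("?", 1)[0])


class RetiredServiceHandler(BaseHTTPRequestHandler):
    """Answers every GET with a 301 to the new home, or 410 Gone."""

    def do_GET(self):
        target = redirect_target(self.path)
        if target is None:
            self.send_error(410, "This service has been retired")
            return
        self.send_response(301)
        self.send_header("Location", target)
        self.end_headers()
```

To try it locally, pass the handler to `http.server.HTTPServer(("", 8080), RetiredServiceHandler)` and call `serve_forever()`.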
26
Re-write and re-home: new kit, new O/S, new software. Quite a lot of small apps doing a specific task could be re-written and placed on our internal cloud very quickly. Then a DNS switch was all that was required.
27
nasty intertwined things with multiple dependencies
The thorniest ones were where we planned to switch off one thing after another had been retired, and that was in turn dependent on another system. For example, an old cache CVS was still needed as it homed the static assets for an old webserver that an ancient web form still used. That form was due to be replaced by a new app entirely, but that work had not been prioritised by the product owner yet, and so on. The more I thought about this, the more it felt like a log jam, or a traffic jam.
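A chain like the one above is a dependency graph, and a safe switch-off order is a topological sort of it. The sketch below uses Python’s standard `graphlib` module (Python 3.9 or later); the system names are illustrative labels taken from the example, not real hostnames, and nothing here is the approach the team actually used.

```python
from graphlib import CycleError, TopologicalSorter

# For each system, the set of systems that must be retired BEFORE it.
# Names are illustrative only.
MUST_BE_RETIRED_FIRST = {
    "old_cache_cvs": {"old_webserver"},
    "old_webserver": {"ancient_web_form"},
    "ancient_web_form": set(),
}


def decommission_order(blockers):
    """Return a switch-off order in which nothing is retired while still needed."""
    try:
        return list(TopologicalSorter(blockers).static_order())
    except CycleError as error:
        # A cycle is a true log jam: no system in it can go first.
        raise ValueError(f"circular dependency: {error.args[1]}") from error


order = decommission_order(MUST_BE_RETIRED_FIRST)
print(" -> ".join(order))
```

The sort only gives the order. The real blocker in the example, work that has not been prioritised yet, still has to be solved by people.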
28
Dashing, Roel Berger. We introduced a deadline with a countdown for the most unwilling. Dashing, from Shopify, is a lovely tool for making dashboards to put on TVs round the office. We stole a countdown widget for use in an instance of Dashing. It was written by a Roel Berger. If you are in the room, please see me afterwards: I would like to buy you a drink.
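The logic behind a countdown like this is small. The sketch below is not the Dashing widget itself (Dashing widgets are not written in Python); it only shows the calculation, and the deadline date is a made-up placeholder.

```python
from datetime import datetime


def time_remaining(deadline, now):
    """Format the time left until the deadline as days, hours and minutes."""
    remaining = deadline - now
    if remaining.total_seconds() <= 0:
        return "deadline passed"
    # timedelta keeps whole days separately from the leftover seconds.
    hours, remainder = divmod(remaining.seconds, 3600)
    minutes = remainder // 60
    return f"{remaining.days}d {hours}h {minutes}m"


# Hypothetical switch-off deadline, for illustration only.
deadline = datetime(2030, 3, 31, 17, 0)
print(time_remaining(deadline, datetime.now()))
```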
29
Sometimes it is better to beg forgiveness than to seek permission. We got increasingly bullish as the project went on. There were occasions where we simply turned something off and waited for someone to complain. (Almost) no one did. We had a couple of hiccoughs: our syndication service had some duplicates in it, and a web site for newspaper retailers to change orders didn’t work for a day.
30
We started to kill off the systems one by one
[Slide graphic: "Classic Decom Progress in Sprint 4". Labels as shown: FT Retailer Methode/Render EPCVS 31st March SOAP SMTP solaris zones 16 physical boxes Paymentech DNS switcher Code out of CVS Code out of PHX p cache FTNI staler grater bibit world pay DE Login]
We had to keep the team motivated and show progress. This is a copy of a slide from an internal presentation given while we were in the middle of the work. We drew bad pictures of the systems and services, represented by tombstones and firing squad victims.
31
We made graphs to show progress and keep the team motivated
and we showed the traffic levels dropping as more things were changed or migrated
32
we started to get nostalgic/anthropomorphise
Then something odd started to happen. The FT has a fairly low staff turnover rate, many of us have been working at the FT for a while, and some of these systems we had known for years. So, for example, Solaris 8 is no longer a modern operating system, but it was a very reliable one. You could throw rocks at it and it would not let you down. We have planned to frame a system board from an old V. It is important to mark the passing of something dear to you. People need ‘closure’ :)
33
Benefits? But there is no place for sentimentality here
34
This is the response time for articles on FT.com measured at origin
Performance: 30% improvement. A few components of FT.com still relied on the ‘classic’ stack. Once that had gone, the articles generally got faster.
35
Speed to market
Now that we don’t have to code round old applications, or have as many antiquated release tools, we are quicker to market. As an example, a replatforming of some of the web servers at origin for FT.com took a couple of days where it would once have taken several weeks. Oh, and that’s a diving peregrine falcon, with a maximum air speed of 389 km/h.
36
Cost savings: well over six figures for losing that chunk of our estate, in hosting, support etc.
37
Working practices
Patch cycles: services are now in patch groups on the new estate, so we can patch and bounce hosts. No need for scary five-year uptimes. Monitoring comes out of the box as we provision, and systems are designed with monitoring in mind. IT governance has had to catch up with the speed at which we can get products out, with streamlined change and release management processes. Not perfect, but lots better.
38
I sleep better
But there are also less tangible benefits, things that are harder to measure. Now that my wages and the roof over my family’s heads are not dependent on old, unreliable systems, I can sleep a little easier in my bed. That monitoring I mentioned means we now have a better view of what’s working and what’s not. Speed to fix has improved, as has reliability.
39
here be dragons .. well it turns out there aren’t
There goes the fear
“Hic sunt dracones”, Hunt-Lenox Globe, Wikipedia (again). We are considerably less scared of turning off the old stuff that no one understands any more. That organisational culture change is, for me, the biggest thing. Turning off old systems is ‘normal’ now.
40
Something else i had not really considered in advance was the human resource benefit ...
We lost a few key-person dependencies, and people’s time was freed up to work on new shiny things. We no longer need to devote as much time to looking after creaking systems. Oh, and did I mention we are hiring?
41
an important point to note is the job is never really finished ..
I called it a project earlier. That implies it is a distinct task with a beginning and an end. The Forth Bridge is a metaphor for a never-ending task, though I understand that technology has enabled them to stop constantly painting it. There will always be something to clean up. As the pace of change keeps up, we will always need someone with a broom.
42
£ $ € needed even more in the public cloud ... costs will grow without some control
43
some possible solutions we are working with
Next steps: the job is not done yet. There are still products that slip through ‘ownership’ gaps, and when someone builds a new thing we don’t yet insist on an exit strategy (a prediction for the future of a product, for example on how we turn something off). Some possible solutions we are working with: making things replaceable components or microservices; RESTful APIs for everything; standardisation for things like transaction IDs in logs.
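To make the last of those concrete, here is a minimal sketch of stamping every log line for one piece of work with the same transaction ID, using Python’s standard `logging` module. The `transaction_id=` field name and the line format are assumptions made for illustration, not the FT’s standard.

```python
import io
import logging
import uuid

# Assumed convention: every line carries a transaction_id=<value> field.
LOG_FORMAT = "%(levelname)s transaction_id=%(transaction_id)s %(message)s"


def build_logger(name, stream):
    """Create a logger that writes lines in the shared format to the stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger.handlers = [handler]
    return logger


def for_transaction(logger, transaction_id=None):
    """Wrap a logger so every message carries the same transaction ID."""
    transaction_id = transaction_id or uuid.uuid4().hex
    return logging.LoggerAdapter(logger, {"transaction_id": transaction_id})


stream = io.StringIO()
log = for_transaction(build_logger("orders", stream), "abc123")
log.info("order received")
log.info("order stored")
print(stream.getvalue(), end="")
```

With a consistent field like this, a log analysis tool can follow one request across several services by searching for a single ID.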
44
Media world is changing faster than ever and we have to react
The hard work started with hearts and minds
We persuaded people into backing the project
We had to understand the systems we wanted to decommission
Different services required different solutions
There were huge benefits to decommissioning
Both intangible and very (cold hard cash) tangible
45
Questions?
46
Other FT talks at Velocity
Andrew Betts, 9:40 Tuesday 18th, Plenary: Upgrading the Web: Polyfills, Components and the Future of Web Development at Scale
Kornel Lesiński, 11:50 Tuesday 18th, Performance track: Lossy Compression of True-color PNG Images
Matt Andrews, 16:00 Tuesday 18th, Performance track: Offline first web apps