Yet Another Rails Scaling Presentation

Ruby on Rails Meetup, May 10, 2007
Jared Friedman (jared@scribd.com) and Tikhon Bernstam (tikhon@scribd.com)

Should you bother with scaling?
• Well, it depends
• But if you're launching a startup, probably
• The best way to launch a startup these days is to get it on TechCrunch, Digg, Reddit, etc.
• You don't get as much time to grow organically as you used to
• You only get one launch: you don't want your site to fall over

The Predecessors
• Other great places to look for info on this:
• poocs.net: The Adventures of Scaling Rails
  http://poocs.net/2006/3/13/theadventuresofscalingstage1
• Stefan Kaes: Performance Rails
  http://railsexpress.de/blog/files/slides/rubyenrails2006.pdf
• Robot Co-op blog and gems
  http://www.robotcoop.com/articles/2006/10/10/thesoftwareandhardwarethatrunsoursites
• O'Reilly book: High Performance MySQL
  • It's not Rails, but it's really useful

Big Picture
• This presentation will concentrate on what's different from previous writings; it is not a comprehensive overview
• Available at http://www.scribd.com/blog

Who we are
• Scribd.com
• Like YouTube for documents
• Launched in March 2007
• Handles ~1M requests per day

Key Points
• General architecture
• Use fragment caching!
• Rolling your own traffic analytics and some SQL tips

Current Scribd architecture
• 1 web server
• 3 database servers
• 3 document conversion servers
• Test and backup machines
• Amazon S3

Server Hardware
• Dual, dual-core Woodcrests at 3 GHz
• 16 GB of memory
• 4 15K SCSI hard drives in a RAID 10
• We learned: disk speed is important
• Don't skimp: you're not Google, and it's easier to scale up than out
• Softlayer is a great dedicated hosting company

Various software details
• CentOS
• Apache/Mongrel
• Memcached, Robot Co-op's memcache-client
• Stefan Kaes' SQLSessionStore
  • Best way to store persistent sessions
• Monit, Capistrano
• Postfix

Fragment Caching
• "We don't use any page or fragment caching." - Robot Co-op
• "Play with fragment caching ... no improvement, changes were reverted at a later time." - poocs.net
• Well, maybe it's application specific
• Scribd uses fragment caching extensively, with enormous performance improvement

(screenshot)

How to Use Fragment Caching
• Ignore all but the most frequently accessed pages
• Look for pieces of the page that don't change on every page view and are expensive to compute
• Just wrap them in a
  <% cache('keyname') do %>
    ...
  <% end %>
• Do a timing test before and afterwards; backtrack unless there are significant performance gains
• We see >10X
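
The "timing test before and afterwards" step can be sketched in plain Ruby. This is only an illustration under stated assumptions, not Scribd's code: `FRAGMENTS`, `cached_fragment`, `expensive_sidebar` and `time_it` are hypothetical names, a Hash stands in for the real fragment store, and a `sleep` stands in for the expensive computation.

```ruby
# Hypothetical in-memory stand-in for the fragment store.
FRAGMENTS = {}

# Return the stored fragment for +key+, rendering it with the block only on a miss.
def cached_fragment(key)
  FRAGMENTS.fetch(key) { FRAGMENTS[key] = yield }
end

# Hypothetical expensive piece of a page; the sleep stands in for slow queries.
def expensive_sidebar
  sleep 0.02
  "<ul><li>popular documents</li></ul>"
end

# Wall-clock seconds taken by +runs+ calls of the block.
def time_it(runs)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  runs.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

uncached = time_it(10) { expensive_sidebar }
cached   = time_it(10) { cached_fragment("sidebar") { expensive_sidebar } }

puts format("uncached: %.3fs, cached: %.3fs", uncached, cached)
```

If the cached number is not clearly smaller on a real page, that is the signal to back the change out.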

Expiring fragments, 1. Time based
• You should really use memcached for storing fragments
  • Better performance
  • Easier to scale to multiple servers
  • Most important: allows time-based expiration
• Use the plugin http://agilewebdevelopment.com/plugins/memcache_fragments_with_time_expiry
• Dead easy:
  <% cache 'keyname', :expire => 10.minutes do %>
    ...
  <% end %>
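
To make the behaviour of time-based expiry concrete, here is a small plain-Ruby sketch. It is not the plugin's implementation: `TimedFragmentStore` is a hypothetical in-memory class, and the injectable clock exists only so the example can step through time without waiting. The 600 seconds correspond to `10.minutes` in the slide.

```ruby
# Hypothetical in-memory store illustrating time-based expiry.
# memcached handles expiry itself; this class only shows the behaviour.
class TimedFragmentStore
  Entry = Struct.new(:value, :expires_at)

  # +clock+ is any callable returning the current time in seconds.
  def initialize(clock = -> { Time.now.to_f })
    @clock = clock
    @entries = {}
  end

  # Return the fragment for +key+ if it is still fresh; otherwise
  # render it with the block and keep it for +expire+ seconds.
  def fetch(key, expire:)
    entry = @entries[key]
    return entry.value if entry && entry.expires_at > @clock.call

    value = yield
    @entries[key] = Entry.new(value, @clock.call + expire)
    value
  end
end

now = 0.0
store = TimedFragmentStore.new(-> { now })
renders = 0
render = -> { renders += 1; "rendered #{renders} time(s)" }

first  = store.fetch("keyname", expire: 600) { render.call } # miss: renders
now = 300.0
second = store.fetch("keyname", expire: 600) { render.call } # still fresh: reused
now = 601.0
third  = store.fetch("keyname", expire: 600) { render.call } # expired: renders again

puts first, second, third
```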

Expiring fragments, 2. Manually
• No need to serve stale data
• Just use:
  Cache.delete("fragment:/partials/whatever")
• Clear fragments whenever data changes
• Again, easier with memcached
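
"Clear fragments whenever data changes" could look roughly like the following plain-Ruby sketch. Everything here is an assumption made for illustration: `CACHE` is a Hash standing in for the memcached client, and `Document`, `fragment_key` and `document_fragment` are hypothetical names. In a Rails app the delete call would more typically live in a model callback or a cache sweeper.

```ruby
# Hypothetical stand-in for the memcached client; a Hash already offers
# the two operations this sketch needs, #[]= and #delete.
CACHE = {}

# Hypothetical model: any write clears the fragment that displays its data.
class Document
  attr_reader :id, :title

  def initialize(id, title)
    @id = id
    @title = title
  end

  # Cache key for this document's fragment; the naming scheme is illustrative.
  def fragment_key
    "fragment:/partials/document_#{id}"
  end

  def title=(new_title)
    @title = new_title
    CACHE.delete(fragment_key) # expire so the next view re-renders
  end
end

# Render through the cache: reuse the stored fragment or build it.
def document_fragment(doc)
  CACHE[doc.fragment_key] ||= "<h1>#{doc.title}</h1>"
end

doc = Document.new(1, "Draft")
before_change = document_fragment(doc)
doc.title = "Final"
after_change = document_fragment(doc)

puts before_change, after_change
```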

Traffic Analytics
• Google Analytics is nice, but there are a lot of reasons to roll your own traffic analytics too
  • Can be much more powerful
  • You can write SQL to answer arbitrary questions
  • Can expose to users

Scribd's analytics (screenshots)

Building traffic analytics, part 1
• create_table :page_views do |t|
    t.column :user_id,     :integer
    t.column :request_url, :string, :limit => 200
    t.column :session,     :string, :limit => 32
    t.column :ip_address,  :string, :limit => 16
    t.column :referer,     :string, :limit => 200
    t.column :user_agent,  :string, :limit => 200
    t.column :created_at,  :timestamp
  end
• Add a whole bunch of indexes, depending on queries

Building traffic analytics, part 2
• Create a PageView on every request
• We used a hand-built SQL query to take out the ActiveRecord overhead on this
• Might try MySQL's INSERT DELAYED
• Analytics queries are usually hand-coded SQL
• Use EXPLAIN SELECT to make sure MySQL is using the indexes you expect
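
A hand-built insert for the page_views table might be assembled along these lines. This is a sketch, not the query Scribd used: `quote` and `page_view_insert_sql` are hypothetical helpers, and the quoting is deliberately simplified. In a Rails app, `ActiveRecord::Base.connection.quote` is the safer choice for escaping, and the resulting string would be passed to `connection.execute`.

```ruby
# Simplified quoting for illustration only: nil becomes NULL, integers pass
# through, and strings get backslashes and single quotes escaped.
def quote(value)
  return "NULL" if value.nil?
  return value.to_s if value.is_a?(Integer)

  "'" + value.to_s.gsub("\\") { "\\\\" }.gsub("'") { "''" } + "'"
end

# Build one INSERT statement for the page_views table.
# Passing true for +delayed+ emits MySQL's INSERT DELAYED instead.
def page_view_insert_sql(attrs, delayed = false)
  columns = attrs.keys
  values  = columns.map { |column| quote(attrs[column]) }
  verb    = delayed ? "INSERT DELAYED" : "INSERT"
  "#{verb} INTO page_views (#{columns.join(', ')}) VALUES (#{values.join(', ')})"
end

# created_at is set by hand because ActiveRecord's automatic
# timestamping is bypassed along with the rest of its overhead.
attrs = {
  user_id: 42,
  request_url: "/doc/1",
  ip_address: "10.0.0.1",
  referer: nil,
  user_agent: "O'Browser",
  created_at: "2007-05-10 12:00:00"
}

sql = page_view_insert_sql(attrs)
delayed_sql = page_view_insert_sql(attrs, true)
puts sql
```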

Building traffic analytics, part 3
• Scales pretty well
• BUT analytics queries are expensive and can clog up the main DB server
• Our solution:
  • use two DB servers in a master/slave setup
  • move all the analytics queries to the slave

Rails with multiple databases, part 1
• "At this point in time there's no facility in Rails to talk to more than one database at a time." - Alex Payne, Twitter developer
• Well, that's true
• But setting things up yourself is about 10 lines of code
• There are now also two great plugins for doing this:
  • Magic multi-connections
    http://magicmodels.rubyforge.org/magic_multi_connections/
  • Acts as readonlyable
    http://rubyforge.org/frs/?group_id=3451

Rails with multiple databases, part 2
• At Scribd we use this to send predefined expensive queries to a slave
• This can be very important for dealing with lock contention issues
• You could also do automatic load balancing, but synchronization becomes more complicated (read a SQL book; it's not a Rails issue)

Rails with multiple databases, code
• In database.yml
  slave1:
    adapter: mysql
    host: 18.48.43.29 # your slave's IP
    database: production
    username: root
    password: pass

• Define a model Slave1.rb
  class Slave1 < ActiveRecord::Base
    self.abstract_class = true
    establish_connection :slave1
  end

• When you need to run a query on the slave, just do
  Slave1.connection.execute("select * from some_table")
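
One way to keep the "predefined expensive queries go to the slave" rule in a single place is a small router object. The sketch below is an assumption-laden illustration, not Scribd's code: `FakeConnection` stands in for real connections so the example runs without MySQL, and `QueryRouter`, `SLAVE_QUERIES` and `report` are hypothetical names. In the app, the two connections would be `ActiveRecord::Base.connection` and `Slave1.connection`.

```ruby
# Stand-in for a database connection so the sketch runs without MySQL.
class FakeConnection
  attr_reader :name, :log

  def initialize(name)
    @name = name
    @log = []
  end

  # Record the SQL instead of sending it to a server.
  def execute(sql)
    @log << sql
    "#{name} ran: #{sql}"
  end
end

# Hypothetical list of predefined expensive queries, keyed by name.
SLAVE_QUERIES = {
  daily_views: "select date(created_at), count(*) from page_views group by 1"
}.freeze

# Hypothetical helper: named expensive queries go to the slave,
# everything else (including all writes) stays on the master.
class QueryRouter
  def initialize(master, slave, slave_queries)
    @master = master
    @slave = slave
    @slave_queries = slave_queries
  end

  # Run a predefined query by name on the slave.
  def report(name)
    @slave.execute(@slave_queries.fetch(name))
  end

  # Run ad hoc SQL on the master.
  def execute(sql)
    @master.execute(sql)
  end
end

master = FakeConnection.new("master")
slave  = FakeConnection.new("slave1")
router = QueryRouter.new(master, slave, SLAVE_QUERIES)

router.execute("insert into page_views (request_url) values ('/doc/1')")
result = router.report(:daily_views)
puts result
```

Restricting the slave to a fixed list of named queries sidesteps the synchronization questions that come with automatic load balancing.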

Shameless Self Promotion
• Scribd.com: VC backed and hiring
• Just 3 people so far! >10 by end of year.
• Awesome salary/equity combination
• If you're reading this, you're probably the right kind of person
• Building the world's largest open document library
• Email: hackers@scribd.com