Gavin W. Burris on 18 Apr 2016 12:36:36 -0700



Re: [PLUG] >32K concurrent processes


Just saw your previous post: the 32K limit is for the testing clients, not the database itself.  I'd just spin up more boxes once you find the optimal number of clients a single one can handle.  Keep us posted.  This is neat stuff.  Cheers.
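
Fanning the clients out once the per-box sweet spot is known could look roughly like this; the hostnames, the 20000-per-box figure, and the client binary are all placeholders:

```shell
# Hypothetical fan-out of test clients across several load-generator boxes;
# hostnames, counts, and the client command are placeholders.
for host in loadgen1 loadgen2 loadgen3; do
    ssh "$host" 'for i in $(seq 1 20000); do ./simulated_client & done; wait' &
done
wait   # block until every remote batch has launched and finished
```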

On Mon 04/18/16 03:32PM EDT, Gavin W. Burris wrote:
> Hi, Bhaskar.
> 
> This sounds really neat.  Why do you need to simultaneously serialize ALL transactions?  For instance, my bank balance or my medical records have absolutely no real-time serial dependencies on any other account.  Maybe a service fee on my medical record is a dependency, but just update those daily.  My balance may depend on a transfer, but just look at the posted timestamp.  If one needs a bank-wide report, again, just look at transactions up to a specific timestamp.  What is an acceptable granularity?  Sure, you can get millisecond accuracy this way, but why would you want that given the downsides?  Is this some kind of high-frequency trading scheme?  If so, any further communications will have to be under billable hours for my private consulting services.  :D
> 
> Cheers.
> 
> On Mon 04/18/16 02:44PM EDT, K.S. Bhaskar wrote:
> > This is not a distributed environment - it's a single system. The reason is
> > transaction serialization. When every transaction can potentially depend on
> > the result of the preceding transaction, the more you can centralize
> > serialization decision making, the faster you can make decisions required
> > to ensure ACID properties at transaction commit time. With GT.M, this
> > serialization is done in the shared memory of a single computing node. Even
> > with technologies such as RDMA over Infiniband, IPC between processes on
> > different nodes is one to two orders of magnitude slower than processes on
> > a single node. So, as long as throughput is not constrained by the amount
> > of CPU, RAM, or IO you can put on a single node, centralized serialization
> > gives you the best overall throughput. With GT.M, and with the types of
> > computer system you can purchase today, the throughput you can achieve on a
> > single node is big enough to handle the needs of real-time core-processing
> > (a core system is the system of record for your bank balance) on just about
> > any bank. The largest real-time core systems in production anywhere in the
> > world today that I know of run on GT.M - these are systems with over 30
> > million accounts. In healthcare, the real-time electronic health records
> > for the entire Jordanian Ministry of Health system are being rolled out on
> > a single system (⅓ of the electronic health records for a country with the
> > area and population of Indiana processed on a single system).
> > 
> > What people think of as a horizontally scalable architecture for a
> > transactional system is stateless application servers that can be spun up
> > as needed, but which send all the needed state to a database under the
> > covers. This architecture scales only as well as the database scales on a
> > single node, which is to say not very well - in our testing some years ago,
> > we found that because of transaction serialization, a popular database
> > scaled better on a single node than across multiple nodes in a cluster.
> > 
> > So thanks for all the suggestions but for now, the specific information I
> > need is how to configure a Linux system to allow more than 32K concurrent
> > processes. Increasing pid_max is a necessary change, but clearly not a
> > sufficient change.
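
For reference, the limits that most often cap process count beyond pid_max can be inspected and raised roughly as follows. All values are illustrative rather than tuned recommendations, and the systemd task limits only exist on newer systemd versions than Jessie ships:

```shell
# Check the limits that commonly cap concurrent processes (no root needed):
cat /proc/sys/kernel/pid_max      # largest PID the kernel will assign
cat /proc/sys/kernel/threads-max  # system-wide limit on tasks
ulimit -u                         # per-user process limit (RLIMIT_NPROC)

# Raising them requires root; values here are illustrative:
# sysctl -w kernel.pid_max=1000000
# sysctl -w kernel.threads-max=1000000
# sysctl -w vm.max_map_count=2000000   # mapped regions; relevant when many
#                                      # processes map the same file

# Per-user limit, in /etc/security/limits.conf:
#   youruser  soft  nproc  1000000
#   youruser  hard  nproc  1000000

# On newer systemd versions, logind/system task limits can also apply:
#   UserTasksMax= in /etc/systemd/logind.conf
#   DefaultTasksMax= in /etc/systemd/system.conf
```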
> > 
> > Regards
> > -- Bhaskar
> > 
> > 
> > 
> > On Mon, Apr 18, 2016 at 12:19 PM, Keith C. Perry <kperry@daotechnologies.com> wrote:
> > 
> > > Bhasker,
> > >
> > > What's the deployment infrastructure?  When you say "We're trying to run a
> > > workload that simulates a large number of concurrent users (as you might
> > > find at a large financial or healthcare institution)", it makes me think
> > > this is more of a distributed environment where you need a large number of
> > > clients being serviced by a pool of servers.
> > >
> > > If that is the case, it sounds more like you would need a listener (server)
> > > running that would then fork off or thread child connections to respond to
> > > client requests.  This is also something that can be achieved in a local
> > > context, with the listener on the localhost IP or using Unix sockets.
> > >
> > >
> > > ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
> > > Keith C. Perry, MS E.E.
> > > Owner, DAO Technologies LLC
> > > (O) +1.215.525.4165 x2033
> > > (M) +1.215.432.5167
> > > www.daotechnologies.com
> > >
> > > ------------------------------
> > > *From: *"K.S. Bhaskar" <bhaskar@bhaskars.com>
> > > *To: *"Philadelphia Linux User's Group Discussion List" <
> > > plug@lists.phillylinux.org>
> > > *Sent: *Monday, April 18, 2016 11:49:19 AM
> > > *Subject: *Re: [PLUG] >32K concurrent processes
> > >
> > > Thanks for the suggestions, Gavin, but batching the load won't work in
> > > this case. We're trying to run a workload that simulates a large number of
> > > concurrent users (as you might find at a large financial or healthcare
> > > institution) all of whom expect the system to respond immediately when they
> > > ask it to do something. I intend to play with the scheduler.
> > >
> > > Regards
> > > -- Bhaskar
> > >
> > >
> > > On Mon, Apr 18, 2016 at 9:13 AM, Gavin W. Burris <bug@wharton.upenn.edu>
> > > wrote:
> > >
> > >> Good morning, Bhaskar.
> > >>
> > >> Have you considered using /dev/shm aka tmpfs for shared memory on Linux?
> > >> Maybe stage all required files there and make sure you are read-only where
> > >> possible.
> > >>
> > >> With so many processes, your system is constantly context-switching.
> > >> Assuming you are not oversubscribing RAM (32GB / 32K is about 1MB per
> > >> process), you will want to tune the kernel scheduler.
> > >>
> > >> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Performance_Tuning_Guide/sect-Red_Hat_Enterprise_Linux-Performance_Tuning_Guide-CPU-Configuration_suggestions.html#sect-Red_Hat_Enterprise_Linux-Performance_Tuning_Guide-Configuration_suggestions-Tuning_scheduling_policy
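
Concretely, the tmpfs staging and scheduler tuning suggested above might look like this; the paths and sysctl values are illustrative:

```shell
# Stage input files on RAM-backed tmpfs and make them read-only:
mkdir -p /dev/shm/stage
cp -r /path/to/inputs /dev/shm/stage/
chmod -R a-w /dev/shm/stage

# Scheduler tuning (root; values illustrative):
# sysctl -w kernel.sched_migration_cost_ns=5000000  # keep tasks on their CPU longer
# Run throughput-oriented workers under SCHED_BATCH to reduce preemption:
# chrt --batch 0 ./worker
```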
> > >>
> > >> This very much sounds like an HPC problem (high-performance computing),
> > >> so my initial reaction is why not use a resource manager tuned for
> > >> high-throughput?  Take a look at Open Grid Scheduler (
> > >> http://gridscheduler.sourceforge.net/), an open source version of Grid
> > >> Engine.  This will give you a layer of control, a job queue, where you
> > >> could then do a task array.  Maybe you could launch 1000 jobs that iterate
> > >> 320 times?  The job queue could then be tuned to keep the system optimally
> > >> utilized without overloading it: don't run everything at once, but place it
> > >> in a queue that works through the load as resources become available.  I
> > >> would strongly consider using Grid Engine, especially given your statement
> > >> that the procs "do a teeny bit of activity every 10 seconds."
> > >>
> > >> Cheers.
> > >>
> > >> On Sun 04/17/16 11:12AM EDT, K.S. Bhaskar wrote:
> > >> > Thanks for the links Rohit. I'll check them out. The storage is SSD, the
> > >> > processes do minimal IO - I'm just trying to establish the ability to have
> > >> > a file open by more than 32K processes, and I'm clearly running into a
> > >> > system limit. This is a development machine (16 cores, 32GB RAM - the
> > >> > production machine has something like 64 cores and 512GB RAM), but I can't
> > >> > get you access to poke around because it is inside a corporate network.
> > >> >
> > >> > However, as the software is all open source, I can easily help you get set
> > >> > up to poke around using your own system, if you want. Please let me know.
> > >> >
> > >> > Regards
> > >> > -- Bhaskar
> > >> >
> > >> >
> > >> > On Sun, Apr 17, 2016 at 10:54 AM, Rohit Mehta <ro@paper-mill.com> wrote:
> > >> >
> > >> > > Some kernel parameters to research (which may not be right for your
> > >> > > application)
> > >> > >
> > >> > > https://www.debian-administration.org/article/656/Installing_Oracle11_and_Oracle12_on_Debian_Wheezy_Squeeze
> > >> > > and /etc/security.conf changes
> > >> > > http://stackoverflow.com/questions/9361816/maximum-number-of-processes-in-linux
> > >> > >
> > >> > > Do these processes do a lot of IO?  Is your storage rotational media or
> > >> > > SSD?  Can your application run off ramdisk storage?  Have you tried
> > >> > > enabling hyperthreading?
> > >> > >
> > >> > > Do you have the ability to test application loads on a non-production
> > >> > > system?  If so, I'd be interested in helping you poke around.  It might
> > >> > > be an education for me.
> > >> > >
> > >> > >
> > >> > > On Sun, Apr 17, 2016 at 10:42 AM, Rohit Mehta <ro@paper-mill.com> wrote:
> > >> > >
> > >> > >> Back many years ago, I installed Oracle on my Debian workstation for fun,
> > >> > >> and I remember the guide had a lot of tweaks.  "ulimit" is one that I can
> > >> > >> think of, but I don't remember them all.  I'm poking around the internet to
> > >> > >> see if I can find the oracle guide (although it might not be relevant on
> > >> > >> newer kernels)
> > >> > >>
> > >> > >> On Sun, Apr 17, 2016 at 10:27 AM, K.S. Bhaskar <bhaskar@bhaskars.com> wrote:
> > >> > >>
> > >> > >>> Thanks Steve, but in this case we have a customer need to crank up the
> > >> > >>> number of processes on Linux.
> > >> > >>>
> > >> > >>> Regards
> > >> > >>> -- Bhaskar
> > >> > >>>
> > >> > >>> On Sat, Apr 16, 2016 at 4:09 PM, Steve Litt <slitt@troubleshooters.com> wrote:
> > >> > >>>
> > >> > >>>> On Fri, 15 Apr 2016 17:40:09 -0400
> > >> > >>>> "K.S. Bhaskar" <bhaskar@bhaskars.com> wrote:
> > >> > >>>>
> > >> > >>>> > I am trying to crank up more than 32K concurrent processes (the
> > >> > >>>> > processes themselves hang and do a teeny bit of activity every 10
> > >> > >>>> > seconds). But the OS (64-bit Debian 8 - Jessie) stubbornly refuses to
> > >> > >>>> > crank up beyond 32K-ish processes. pid_max is set to a very large
> > >> > >>>> > number (1M), so that's not it. Any suggestions on what limits to look
> > >> > >>>> > for appreciated. Thank you very much.
> > >> > >>>>
> > >> > >>>> This is old information, but back in the day people who wanted lots and
> > >> > >>>> lots of processes used one of the BSDs to host that server.
> > >> > >>>>
> > >> > >>>> SteveT
> > >> > >>>>
> > >> > >>>> Steve Litt
> > >> > >>>> April 2016 featured book: Rapid Learning for the 21st Century
> > >> > >>>> http://www.troubleshooters.com/rl21
> > >> > >>>>
> > >> > >>>>
> > >> > >>>> ___________________________________________________________________________
> > >> > >>>> Philadelphia Linux Users Group         --        http://www.phillylinux.org
> > >> > >>>> Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
> > >> > >>>> General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug
> > >> > >>>>
> > >> > >>>
> > >> > >>>
> > >> > >>>
> > >> > >>>
> > >> > >>>
> > >> > >>>
> > >> > >>
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >>
> > >> >
> > >>
> > >>
> > >> --
> > >> Gavin W. Burris
> > >> Senior Project Leader for Research Computing
> > >> The Wharton School
> > >> University of Pennsylvania
> > >> Search our documentation: http://research-it.wharton.upenn.edu/about/
> > >> Subscribe to the Newsletter: http://whr.tn/ResearchNewsletterSubscribe
> > >>
> > >>
> > >
> > >
> > >
> > >
> > >
> 
> 
> 

-- 
Gavin W. Burris
Senior Project Leader for Research Computing
The Wharton School
University of Pennsylvania
Search our documentation: http://research-it.wharton.upenn.edu/about/
Subscribe to the Newsletter: http://whr.tn/ResearchNewsletterSubscribe