K.S. Bhaskar on 18 Apr 2016 13:20:36 -0700



Re: [PLUG] >32K concurrent processes


Gavin, we're a long way from hitting the scalability limits of the hardware of a single node, and the use case is maximizing transaction throughput with serialization (we're a half to one order of magnitude ahead of popular database engines, and intend to keep it that way). When we want to get past the scalability of a single node, and must compromise transaction throughput to get there, we'll consider alternative architectures.

As to the architecture showing its age, it's interesting that in recent years popular database engines have adopted architectural elements that GT.M has had for years, if not decades.

Regards
-- Bhaskar

On Mon, Apr 18, 2016 at 4:03 PM, Gavin W. Burris <bug@wharton.upenn.edu> wrote:
Hi, Bhaskar.

That would definitely yield benefits in some ways.  I can't help feeling the architecture is showing its age, with the problem of not being able to scale past a single node efficiently.  I guess overloading a single host is still advantageous for problems of a certain size.  What is the breaking point at which the latency of inter-node communication becomes acceptable?  What use case is driving the decision to so strongly avoid going multi-node?

Cheers.

On Mon 04/18/16 03:56PM EDT, K.S. Bhaskar wrote:
> Gavin, the clients and database are the same. The database logic is inside
> application processes, or application logic is inside the database - either
> works out to the same thing. That's the trend these days in very high end
> databases, except that this has been GT.M's architecture since way back
> when.
>
> Regards
> -- Bhaskar
>
> On Mon, Apr 18, 2016 at 3:36 PM, Gavin W. Burris <bug@wharton.upenn.edu>
> wrote:
>
> > Just saw your previous post, that the 32k is for the testing clients, not
> > somehow the database.  I'd just spin up more boxes, once you find the
> > optimal number of clients a single one can handle.  Keep us posted.  This
> > is neat stuff.  Cheers.
> >
> > On Mon 04/18/16 03:32PM EDT, Gavin W. Burris wrote:
> > > Hi, Bhaskar.
> > >
> > > This sounds really neat.  Why do you need to simultaneously
> > > serialize ALL transactions?  For instance, my bank balance or my
> > > medical records have absolutely no real-time serial dependencies
> > > on any other account.  Maybe a service fee on my medical record is
> > > a dependency, but just update those daily.  My balance may depend
> > > on a transfer, but just look at the posted timestamp.  If one
> > > needs a bank-wide report, again, just look at transactions to a
> > > specific timestamp.  What is an acceptable granularity?  Sure, you
> > > can get millisecond accuracy this way, but why would you want that
> > > given the downsides?  Is this some kind of high-frequency trading
> > > scheme?  If so, any further communications will have to be under
> > > billable hours for my private consulting services.  :D
> > >
> > > Cheers.
> > >
> > > On Mon 04/18/16 02:44PM EDT, K.S. Bhaskar wrote:
> > > > This is not a distributed environment - it's a single system.
> > > > The reason is transaction serialization. When every transaction
> > > > can potentially depend on the result of the preceding
> > > > transaction, the more you can centralize serialization decision
> > > > making, the faster you can make the decisions required to ensure
> > > > ACID properties at transaction commit time. With GT.M, this
> > > > serialization is done in the shared memory of a single computing
> > > > node. Even with technologies such as RDMA over InfiniBand, IPC
> > > > between processes on different nodes is one to two orders of
> > > > magnitude slower than between processes on a single node. So, as
> > > > long as throughput is not constrained by the amount of CPU, RAM,
> > > > or IO you can put on a single node, centralized serialization
> > > > gives you the best overall throughput. With GT.M, and with the
> > > > types of computer system you can purchase today, the throughput
> > > > you can achieve on a single node is big enough to handle the
> > > > needs of real-time core processing (a core system is the system
> > > > of record for your bank balance) at just about any bank. The
> > > > largest real-time core systems in production anywhere in the
> > > > world today that I know of run on GT.M - these are systems with
> > > > over 30 million accounts. In healthcare, the real-time electronic
> > > > health records for the entire Jordanian Ministry of Health system
> > > > are being rolled out on a single system (⅓ of the electronic
> > > > health records for a country with the area and population of
> > > > Indiana, processed on a single system).
> > > >
> > > > What people think of as a horizontally scalable architecture for
> > > > a transactional system is stateless application servers that can
> > > > be spun up as needed, but which send all the needed state to a
> > > > database under the covers. This architecture scales only as well
> > > > as the database scales on a single node, which is to say not very
> > > > well - in our testing some years ago, we found that because of
> > > > transaction serialization, a popular database scaled better on a
> > > > single node than across multiple nodes in a cluster.
> > > >
> > > > So thanks for all the suggestions, but for now the specific
> > > > information I need is how to configure a Linux system to allow
> > > > more than 32K concurrent processes. Increasing pid_max is a
> > > > necessary change, but clearly not a sufficient one.
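[For reference, a sketch of the other knobs that commonly cap concurrent process count on Linux beyond kernel.pid_max. Which of them actually binds depends on the distribution and kernel, so treat this as a checklist rather than a recipe; the values in the comments are illustrative.]

```shell
# Inspect the limits that commonly cap process count (read-only, no root).
cat /proc/sys/kernel/pid_max      # largest PID the kernel will hand out
cat /proc/sys/kernel/threads-max  # system-wide cap on tasks (processes + threads)
ulimit -u                         # per-user process cap (RLIMIT_NPROC)
cat /proc/sys/vm/max_map_count    # can bind when many processes map shared memory

# To raise them (as root), something along these lines:
#   sysctl -w kernel.threads-max=2000000
# and for the per-user cap, in /etc/security/limits.conf:
#   <user>  soft  nproc  1000000
#   <user>  hard  nproc  1000000
```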
> > > >
> > > > Regards
> > > > -- Bhaskar
> > > >
> > > >
> > > >
> > > > On Mon, Apr 18, 2016 at 12:19 PM, Keith C. Perry <kperry@daotechnologies.com> wrote:
> > > >
> > > > > Bhaskar,
> > > > >
> > > > > What's the deployment infrastructure?  When you say "We're
> > > > > trying to run a workload that simulates a large number of
> > > > > concurrent users (as you might find at a large financial or
> > > > > healthcare institution)", it makes me think that this is more
> > > > > of a distributed environment where you need a large number of
> > > > > clients being serviced by a pool of servers.
> > > > >
> > > > > If that is the case, it sounds more like you would need a
> > > > > listener (server) running that would then fork off or thread
> > > > > child connections to respond to client requests.  This is also
> > > > > something that can be achieved in a local context, with the
> > > > > listener on the localhost IP or using Unix sockets.
> > > > >
> > > > >
> > > > > ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
> > > > > Keith C. Perry, MS E.E.
> > > > > Owner, DAO Technologies LLC
> > > > > (O) +1.215.525.4165 x2033
> > > > > (M) +1.215.432.5167
> > > > > www.daotechnologies.com
> > > > >
> > > > > ------------------------------
> > > > > *From: *"K.S. Bhaskar" <bhaskar@bhaskars.com>
> > > > > *To: *"Philadelphia Linux User's Group Discussion List" <
> > > > > plug@lists.phillylinux.org>
> > > > > *Sent: *Monday, April 18, 2016 11:49:19 AM
> > > > > *Subject: *Re: [PLUG] >32K concurrent processes
> > > > >
> > > > > Thanks for the suggestions, Gavin, but batching the load won't
> > > > > work in this case. We're trying to run a workload that
> > > > > simulates a large number of concurrent users (as you might find
> > > > > at a large financial or healthcare institution), all of whom
> > > > > expect the system to respond immediately when they ask it to do
> > > > > something. I intend to play with the scheduler.
> > > > >
> > > > > Regards
> > > > > -- Bhaskar
> > > > >
> > > > >
> > > > > On Mon, Apr 18, 2016 at 9:13 AM, Gavin W. Burris <bug@wharton.upenn.edu> wrote:
> > > > >
> > > > >> Good morning, Bhaskar.
> > > > >>
> > > > >> Have you considered using /dev/shm, aka tmpfs, for shared
> > > > >> memory on Linux?  Maybe stage all required files there and
> > > > >> make sure you are read-only where possible.
> > > > >>
> > > > >> With so many processes, your system is just constantly
> > > > >> switching between threads.  Assuming you are not
> > > > >> oversubscribing RAM (32GB / 32k is less than 1MB per process),
> > > > >> you will want to tune the kernel scheduler.
> > > > >>
> > > > >> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Performance_Tuning_Guide/sect-Red_Hat_Enterprise_Linux-Performance_Tuning_Guide-CPU-Configuration_suggestions.html#sect-Red_Hat_Enterprise_Linux-Performance_Tuning_Guide-Configuration_suggestions-Tuning_scheduling_policy
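[Gavin's tmpfs-staging suggestion, sketched minimally. No root is needed because /dev/shm is world-writable tmpfs on most distributions; "hotfile" and its contents are stand-ins for whatever the workload actually reads.]

```shell
# Stage a file in tmpfs so repeated reads never touch disk.
mkdir -p /dev/shm/stage
rm -f /dev/shm/stage/hotfile               # make the sketch re-runnable
printf 'frequently read data\n' > /dev/shm/stage/hotfile
chmod a-w /dev/shm/stage/hotfile           # read-only where possible
df -h /dev/shm                             # confirm the staging area is tmpfs-backed
```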
> > > > >>
> > > > >> This very much sounds like an HPC (high-performance
> > > > >> computing) problem, so my initial reaction is: why not use a
> > > > >> resource manager tuned for high throughput?  Take a look at
> > > > >> Open Grid Scheduler (http://gridscheduler.sourceforge.net/),
> > > > >> an open source version of Grid Engine.  This will give you a
> > > > >> layer of control, a job queue, where you could then run a task
> > > > >> array.  Maybe you could launch 1000 jobs that iterate 320
> > > > >> times?  The job queue could then be tuned to not overload the
> > > > >> system and to keep it maximally / optimally utilized - i.e.,
> > > > >> don't run everything at once, but place it in a queue that
> > > > >> runs through what you need as resources become available.  I
> > > > >> would strongly consider using Grid Engine, especially given
> > > > >> your statement that the procs "do a teeny bit of activity
> > > > >> every 10 seconds."
> > > > >>
> > > > >> Cheers.
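[A hypothetical Grid Engine array-job script for the 1000 x 320 split Gavin describes. The script name, the `#$` directives, and the `./client` command are illustrative, not taken from the thread; it would be submitted with `qsub simulate.sh`, requiring a working Grid Engine installation.]

```shell
#!/bin/sh
# simulate.sh - hypothetical SGE array job: 1000 tasks x 320 iterations.
# SGE expands "-t 1-1000" into tasks with SGE_TASK_ID set to 1..1000.
#$ -cwd
#$ -t 1-1000
for i in $(seq 1 320); do
    # Derive a globally unique simulated-user id from task id and iteration.
    ./client --id "$(( (SGE_TASK_ID - 1) * 320 + i ))"
done
```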
> > > > >>
> > > > >> On Sun 04/17/16 11:12AM EDT, K.S. Bhaskar wrote:
> > > > >> > Thanks for the links, Rohit. I'll check them out. The
> > > > >> > storage is SSD, and the processes do minimal IO - I'm just
> > > > >> > trying to establish the ability to have a file open by more
> > > > >> > than 32K processes, and I'm clearly running into a system
> > > > >> > limit. This is a development machine (16 cores, 32GB RAM -
> > > > >> > the production machine has something like 64 cores and 512GB
> > > > >> > RAM), but I can't get you access to poke around because it
> > > > >> > is inside a corporate network.
> > > > >> >
> > > > >> > However, as the software is all open source, I can easily
> > > > >> > help you get set up to poke around using your own system, if
> > > > >> > you want. Please let me know.
> > > > >> >
> > > > >> > Regards
> > > > >> > -- Bhaskar
> > > > >> >
> > > > >> >
> > > > >> > On Sun, Apr 17, 2016 at 10:54 AM, Rohit Mehta <ro@paper-mill.com> wrote:
> > > > >> >
> > > > >> > > Some kernel parameters to research (which may not be
> > > > >> > > right for your application):
> > > > >> > >
> > > > >> > > https://www.debian-administration.org/article/656/Installing_Oracle11_and_Oracle12_on_Debian_Wheezy_Squeeze
> > > > >> > > and /etc/security/limits.conf changes:
> > > > >> > > http://stackoverflow.com/questions/9361816/maximum-number-of-processes-in-linux
> > > > >> > >
> > > > >> > > Do these processes do a lot of IO?  Is your storage
> > > > >> > > rotational media or SSD?  Can your application run off
> > > > >> > > ramdisk storage?  Have you tried enabling hyperthreading?
> > > > >> > >
> > > > >> > > Do you have the ability to test application loads on a
> > > > >> > > non-production system?  If so, I'd be interested in
> > > > >> > > helping you poke around.  It might be an education for me.
> > > > >> > >
> > > > >> > >
> > > > >> > > On Sun, Apr 17, 2016 at 10:42 AM, Rohit Mehta <ro@paper-mill.com> wrote:
> > > > >> > >
> > > > >> > >> Back many years ago, I installed Oracle on my Debian
> > > > >> > >> workstation for fun, and I remember the guide had a lot
> > > > >> > >> of tweaks.  "ulimit" is one that I can think of, but I
> > > > >> > >> don't remember them all.  I'm poking around the internet
> > > > >> > >> to see if I can find the Oracle guide (although it might
> > > > >> > >> not be relevant on newer kernels).
> > > > >> > >>
> > > > >> > >> On Sun, Apr 17, 2016 at 10:27 AM, K.S. Bhaskar <bhaskar@bhaskars.com> wrote:
> > > > >> > >>
> > > > >> > >>> Thanks, Steve, but in this case we have a customer need
> > > > >> > >>> to crank up the number of processes on Linux.
> > > > >> > >>>
> > > > >> > >>> Regards
> > > > >> > >>> -- Bhaskar
> > > > >> > >>>
> > > > >> > >>> On Sat, Apr 16, 2016 at 4:09 PM, Steve Litt <slitt@troubleshooters.com> wrote:
> > > > >> > >>>
> > > > >> > >>>> On Fri, 15 Apr 2016 17:40:09 -0400
> > > > >> > >>>> "K.S. Bhaskar" <bhaskar@bhaskars.com> wrote:
> > > > >> > >>>>
> > > > >> > >>>> > I am trying to crank up more than 32K concurrent
> > > > >> > >>>> > processes (the processes themselves hang and do a
> > > > >> > >>>> > teeny bit of activity every 10 seconds). But the OS
> > > > >> > >>>> > (64-bit Debian 8 - Jessie) stubbornly refuses to
> > > > >> > >>>> > crank up beyond 32K-ish processes. pid_max is set to
> > > > >> > >>>> > a very large number (1M), so that's not it. Any
> > > > >> > >>>> > suggestions on what limits to look for appreciated.
> > > > >> > >>>> > Thank you very much.
> > > > >> > >>>>
> > > > >> > >>>> This is old information, but back in the day, people
> > > > >> > >>>> who wanted lots and lots of processes used one of the
> > > > >> > >>>> BSDs to host that server.
> > > > >> > >>>>
> > > > >> > >>>> SteveT
> > > > >> > >>>>
> > > > >> > >>>> Steve Litt
> > > > >> > >>>> April 2016 featured book: Rapid Learning for the 21st Century
> > > > >> > >>>> http://www.troubleshooters.com/rl21
> > > > >> > >>>>
> > > > >> > >>>>


--
Gavin W. Burris
Senior Project Leader for Research Computing
The Wharton School
University of Pennsylvania
Search our documentation: http://research-it.wharton.upenn.edu/about/
Subscribe to the Newsletter: http://whr.tn/ResearchNewsletterSubscribe
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug
