Gavin W. Burris on 18 Apr 2016 12:32:51 -0700
Re: [PLUG] >32K concurrent processes
Hi, Bhaskar. This sounds really neat. Why do you need to simultaneously serialize ALL transactions? For instance, my bank balance or my medical records have absolutely no real-time serial dependencies on any other account. Maybe a service fee on my medical record is a dependency, but just update those daily. My balance may depend on a transfer, but just look at the posted timestamp. If one needs a bank-wide report, again, just look at transactions up to a specific timestamp. What is an acceptable granularity? Sure, you can get millisecond accuracy this way, but why would you want that given the downsides?

Is this some kind of high-frequency trading scheme? If so, any further communications will have to be under billable hours for my private consulting services. :D

Cheers.

On Mon 04/18/16 02:44PM EDT, K.S. Bhaskar wrote:
> This is not a distributed environment - it's a single system. The reason is transaction serialization. When every transaction can potentially depend on the result of the preceding transaction, the more you can centralize serialization decision making, the faster you can make the decisions required to ensure ACID properties at transaction commit time. With GT.M, this serialization is done in the shared memory of a single computing node. Even with technologies such as RDMA over InfiniBand, IPC between processes on different nodes is one to two orders of magnitude slower than between processes on a single node. So, as long as throughput is not constrained by the amount of CPU, RAM, or IO you can put on a single node, centralized serialization gives you the best overall throughput. With GT.M, and with the types of computer system you can purchase today, the throughput you can achieve on a single node is big enough to handle the needs of real-time core processing (a core system is the system of record for your bank balance) at just about any bank. The largest real-time core systems in production anywhere in the world today that I know of run on GT.M - these are systems with over 30 million accounts. In healthcare, the real-time electronic health records for the entire Jordanian Ministry of Health system are being rolled out on a single system (⅓ of the electronic health records for a country with the area and population of Indiana, processed on a single system).
>
> What people think of as a horizontally scalable architecture for a transactional system is stateless application servers that can be spun up as needed, but which send all the needed state to a database under the covers. This architecture scales only as well as the database scales on a single node, which is to say not very well - in our testing some years ago, we found that because of transaction serialization, a popular database scaled better on a single node than across multiple nodes in a cluster.
>
> So thanks for all the suggestions, but for now the specific information I need is how to configure a Linux system to allow more than 32K concurrent processes. Increasing pid_max is a necessary change, but clearly not a sufficient change.
>
> Regards
> -- Bhaskar
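Beyond pid_max, the limits that most often cap the total process count on a stock Linux install are the kernel-wide task limit and the per-user process rlimit. A minimal sketch of where to look, with purely illustrative values (none of this is GT.M-specific, and distributions ship different defaults):

    # kernel-wide caps
    sysctl -w kernel.pid_max=1000000         # already raised per the thread
    sysctl -w kernel.threads-max=1000000     # total tasks (processes + threads) the kernel allows

    # per-user process limit, enforced at fork() time
    ulimit -u                                # show RLIMIT_NPROC for the current shell

    # raise it persistently via pam_limits, e.g. in /etc/security/limits.conf
    # (or a file under /etc/security/limits.d/):
    #   someuser  soft  nproc  1000000
    #   someuser  hard  nproc  1000000

When RLIMIT_NPROC is exhausted, fork() fails with EAGAIN, which would look exactly like a wall at a fixed process count even though pid_max is huge.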
> On Mon, Apr 18, 2016 at 12:19 PM, Keith C. Perry <kperry@daotechnologies.com> wrote:
>
> > Bhaskar,
> >
> > What's the deployment infrastructure? When you say "We're trying to run a workload that simulates a large number of concurrent users (as you might find at a large financial or healthcare institution)", it makes me think that this is more of a distributed environment where you need a large number of clients being serviced by a pool of servers.
> >
> > If that is the case, it sounds more like you would need a listener (server) running that would then fork off or thread child connections to respond to client requests. This is also something that can be achieved in a local context, with the listener on the localhost IP or using unix sockets.
> >
> > ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
> > Keith C. Perry, MS E.E.
> > Owner, DAO Technologies LLC
> > (O) +1.215.525.4165 x2033
> > (M) +1.215.432.5167
> > www.daotechnologies.com
> >
> > ------------------------------
> > *From: *"K.S. Bhaskar" <bhaskar@bhaskars.com>
> > *To: *"Philadelphia Linux User's Group Discussion List" <plug@lists.phillylinux.org>
> > *Sent: *Monday, April 18, 2016 11:49:19 AM
> > *Subject: *Re: [PLUG] >32K concurrent processes
> >
> > Thanks for the suggestions, Gavin, but batching the load won't work in this case. We're trying to run a workload that simulates a large number of concurrent users (as you might find at a large financial or healthcare institution), all of whom expect the system to respond immediately when they ask it to do something. I intend to play with the scheduler.
> >
> > Regards
> > -- Bhaskar
> >
> > On Mon, Apr 18, 2016 at 9:13 AM, Gavin W. Burris <bug@wharton.upenn.edu> wrote:
> >
> >> Good morning, Bhaskar.
> >>
> >> Have you considered using /dev/shm aka tmpfs for shared memory on Linux? Maybe stage all required files there and make sure you are read-only where possible.
> >>
> >> With so many processes, your system is just constantly switching between threads. Assuming you are not oversubscribing RAM (32GB / 32K is less than 1MB per process), you will want to tune the kernel scheduler.
> >>
> >> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Performance_Tuning_Guide/sect-Red_Hat_Enterprise_Linux-Performance_Tuning_Guide-CPU-Configuration_suggestions.html#sect-Red_Hat_Enterprise_Linux-Performance_Tuning_Guide-Configuration_suggestions-Tuning_scheduling_policy
> >>
> >> This very much sounds like an HPC (high-performance computing) problem, so my initial reaction is: why not use a resource manager tuned for high throughput? Take a look at Open Grid Scheduler (http://gridscheduler.sourceforge.net/), an open source version of Grid Engine. This will give you a layer of control, a job queue, where you could then do a task array. Maybe you could launch 1000 jobs that iterate 320 times? The job queue could then be tuned to not overload the system and keep it maximally / optimally utilized, aka don't run everything at once but place it in a queue that runs through what you need as resources are available. I would strongly consider using Grid Engine, especially given your statement that the procs "do a teeny bit of activity every 10 seconds."
> >>
> >> Cheers.
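For concreteness, the tmpfs, scheduler, and task-array suggestions translate to roughly the following; the mount point, sizes, and sysctl values are only illustrative, and run_batch.sh is a hypothetical wrapper script:

    # stage read-mostly files on a tmpfs mount
    mkdir -p /mnt/scratch
    mount -t tmpfs -o size=8g tmpfs /mnt/scratch
    cp -a /path/to/dataset/. /mnt/scratch/

    # CFS scheduler knobs from the RHEL tuning guide; note the defaults
    # before experimenting
    sysctl kernel.sched_min_granularity_ns kernel.sched_wakeup_granularity_ns
    sysctl -w kernel.sched_migration_cost_ns=5000000   # make the scheduler less eager to migrate tasks

    # Grid Engine task array: one array job whose tasks each work through a
    # slice of the simulated users serially, instead of forking everything at
    # once; each task reads $SGE_TASK_ID to pick its slice
    qsub -t 1-1000 run_batch.sh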
> >> On Sun 04/17/16 11:12AM EDT, K.S. Bhaskar wrote:
> >> > Thanks for the links, Rohit. I'll check them out. The storage is SSD, and the processes do minimal IO - I'm just trying to establish the ability to have a file open by more than 32K processes, and I'm clearly running into a system limit. This is a development machine (16 cores, 32GB RAM - the production machine has something like 64 cores and 512GB RAM), but I can't get you access to poke around because it is inside a corporate network.
> >> >
> >> > However, as the software is all open source, I can easily help you get set up to poke around using your own system, if you want. Please let me know.
> >> >
> >> > Regards
> >> > -- Bhaskar
> >> >
> >> > On Sun, Apr 17, 2016 at 10:54 AM, Rohit Mehta <ro@paper-mill.com> wrote:
> >> >
> >> > > Some kernel parameters to research (which may not be right for your application):
> >> > > https://www.debian-administration.org/article/656/Installing_Oracle11_and_Oracle12_on_Debian_Wheezy_Squeeze
> >> > > and /etc/security.conf changes:
> >> > > http://stackoverflow.com/questions/9361816/maximum-number-of-processes-in-linux
> >> > >
> >> > > Do these processes do a lot of IO? Is your storage rotational media or SSD? Can your application run off ramdisk storage? Have you tried enabling hyperthreading?
> >> > >
> >> > > Do you have the ability to test application loads on a non-production system? If so, I'd be interested in helping you poke around. It might be an education for me.
> >> > >
> >> > > On Sun, Apr 17, 2016 at 10:42 AM, Rohit Mehta <ro@paper-mill.com> wrote:
> >> > >
> >> > >> Back many years ago, I installed Oracle on my Debian workstation for fun, and I remember the guide had a lot of tweaks. "ulimit" is one that I can think of, but I don't remember them all. I'm poking around the internet to see if I can find the Oracle guide (although it might not be relevant on newer kernels).
> >> > >>
> >> > >> On Sun, Apr 17, 2016 at 10:27 AM, K.S. Bhaskar <bhaskar@bhaskars.com> wrote:
> >> > >>
> >> > >>> Thanks, Steve, but in this case we have a customer need to crank up the number of processes on Linux.
> >> > >>>
> >> > >>> Regards
> >> > >>> -- Bhaskar
> >> > >>>
> >> > >>> On Sat, Apr 16, 2016 at 4:09 PM, Steve Litt <slitt@troubleshooters.com> wrote:
> >> > >>>
> >> > >>>> On Fri, 15 Apr 2016 17:40:09 -0400 "K.S. Bhaskar" <bhaskar@bhaskars.com> wrote:
> >> > >>>>
> >> > >>>> > I am trying to crank up more than 32K concurrent processes (the processes themselves hang around and do a teeny bit of activity every 10 seconds). But the OS (64-bit Debian 8 - Jessie) stubbornly refuses to crank up beyond 32K-ish processes. pid_max is set to a very large number (1M), so that's not it. Any suggestions on what limits to look for appreciated. Thank you very much.
> >> > >>>>
> >> > >>>> This is old information, but back in the day people who wanted lots and lots of processes used one of the BSDs to host that server.
> >> > >>>>
> >> > >>>> SteveT
> >> > >>>>
> >> > >>>> Steve Litt
> >> > >>>> April 2016 featured book: Rapid Learning for the 21st Century
> >> > >>>> http://www.troubleshooters.com/rl21
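Since the question is really "which limit trips first," a throwaway bash sketch like the one below (a toy for a test box, not anything from the thread) forks near-idle processes until spawning fails, so the failure point can be compared against ulimit -u, kernel.threads-max, and similar settings:

    #!/bin/bash
    # Fork N background processes that wake every 10 seconds, printing progress
    # so the point of failure is visible.  Each iteration costs two processes
    # (a subshell plus its sleep); once RLIMIT_NPROC or another limit is hit,
    # fork() starts failing with EAGAIN.  Stop the whole thing with Ctrl-C.
    N=40000
    for i in $(seq 1 "$N"); do
        ( while true; do sleep 10; done ) &
        if (( i % 1000 == 0 )); then echo "spawned $i"; fi
    done
    wait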
--
Gavin W. Burris
Senior Project Leader for Research Computing
The Wharton School
University of Pennsylvania
Search our documentation: http://research-it.wharton.upenn.edu/about/
Subscribe to the Newsletter: http://whr.tn/ResearchNewsletterSubscribe

___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug