K.S. Bhaskar on 18 Apr 2016 11:48:15 -0700



Re: [PLUG] >32K concurrent processes


Thanks, Gavin. With GT.M real-time database replication, there is no single point of failure. Furthermore, there are hooks to create and deploy applications that remain available not just through unplanned events (like system crashes) but also through planned events (such as application upgrades, including many upgrades that involve schema changes). It is a proven architecture that first went into daily live production in 1999.

Regards
-- Bhaskar


On Mon, Apr 18, 2016 at 12:36 PM, Gavin W. Burris <bug@wharton.upenn.edu> wrote:
Hi, Bhaskar.

AH, OK.  I should have asked, "What are you trying to accomplish?"  Don't run everything on one box!  Scale horizontally, with at least two user-facing nodes.  You want to engineer in redundancy from square one.  If you don't, you will have no sane way to handle critical patching and updates, or to scale up.

With Grid Engine, that would be two master hosts and at least two compute nodes actually running the procs, all sharing the cell directory over NFS.  The secondary master is called the shadow master in Grid Engine-speak.  Grid Engine would be a good solution if you need to run some existing command-line or batch code.

If this is for web, strongly consider having a redundant API endpoint to run functions.  A good way to do this would be with Docker and Swarm.  Docker is a completely different approach, but one that is correct for scaling web applications.

Cheers.

On Mon 04/18/16 11:49AM EDT, K.S. Bhaskar wrote:
> Thanks for the suggestions, Gavin, but batching the load won't work in this
> case. We're trying to run a workload that simulates a large number of
> concurrent users (as you might find at a large financial or healthcare
> institution) all of whom expect the system to respond immediately when they
> ask it to do something. I intend to play with the scheduler.
>
> Regards
> -- Bhaskar
>
>
> On Mon, Apr 18, 2016 at 9:13 AM, Gavin W. Burris <bug@wharton.upenn.edu>
> wrote:
>
> > Good morning, Bhaskar.
> >
> > Have you considered using /dev/shm aka tmpfs for shared memory on Linux?
> > Maybe stage all required files there and make sure you are read-only where
> > possible.
> >
> > With so many processes, your system is constantly context switching.
> > Assuming you are not oversubscribing RAM (32GB across more than 32K
> > processes is less than 1MB each), you will want to tune the kernel scheduler.
> >
> > https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Performance_Tuning_Guide/sect-Red_Hat_Enterprise_Linux-Performance_Tuning_Guide-CPU-Configuration_suggestions.html#sect-Red_Hat_Enterprise_Linux-Performance_Tuning_Guide-Configuration_suggestions-Tuning_scheduling_policy
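
The RAM and scheduling pressure mentioned above can be checked with a little arithmetic. This is a minimal sketch: the machine sizes (16 cores / 32GB for development, 64 cores / 512GB for production) come from elsewhere in this thread, while the function names, the even split of memory, and the exact process count of 32 * 1024 are assumptions for illustration only.

```python
def per_process_budget_kib(total_ram_gib, process_count):
    """RAM per process in KiB if memory were split evenly.

    Ignores kernel overhead and pages shared between processes,
    so it is only a rough upper bound on the private budget.
    """
    return total_ram_gib * 1024 * 1024 / process_count


def processes_per_core(process_count, cores):
    """Average number of processes competing for each core."""
    return process_count / cores


if __name__ == "__main__":
    count = 32 * 1024  # assumed target: 32K processes
    for label, ram_gib, cores in (("dev", 32, 16), ("prod", 512, 64)):
        kib = per_process_budget_kib(ram_gib, count)
        per_core = processes_per_core(count, cores)
        print(f"{label}: {kib:.0f} KiB per process, {per_core:.0f} processes per core")
```

On the development box this works out to 1024 KiB per process and 2048 processes per core, which is why both memory oversubscription and scheduler behavior are worth looking at.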
> >
> > This very much sounds like an HPC problem (high-performance computing), so
> > my initial reaction is why not use a resource manager tuned for
> > high-throughput?  Take a look at Open Grid Scheduler (
> > http://gridscheduler.sourceforge.net/), an open source version of Grid
> > Engine.  This will give you a layer of control, a job queue, where you
> > could then do a task array.  Maybe you could launch 1000 jobs that iterate
> > 320 times?  The job queue could then be tuned to not overload the system
> > and keep the system maximally / optimally utilized, aka don't run
> > everything at once but place it in a queue that runs through what you need
> > as resources are available.  I would strongly consider using Grid Engine,
> > especially given your statement that the procs "do a teeny bit of activity
> > every 10 seconds."
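
The task-array idea can be sketched as a partition of the work into index ranges, one per array task. This is only an illustration: the total of 32 * 1024 work items, the 1000 array tasks, and the function name plan_task_array are assumptions, not figures or code from Grid Engine. In Grid Engine each array task would typically pick its range using the SGE_TASK_ID environment variable.

```python
def plan_task_array(total_items, array_tasks):
    """Split total_items as evenly as possible across array_tasks.

    Returns a list of (start, stop) half-open ranges, one per task.
    Earlier tasks receive one extra item when the split is uneven.
    """
    base, extra = divmod(total_items, array_tasks)
    ranges = []
    start = 0
    for task in range(array_tasks):
        size = base + (1 if task < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    plan = plan_task_array(32 * 1024, 1000)  # assumed sizes
    sizes = sorted({stop - start for start, stop in plan})
    print(f"{len(plan)} tasks, iterations per task: {sizes}")
```

The queue, not this script, would decide how many of those tasks run at once; the partition only fixes how much each task iterates over.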
> >
> > Cheers.
> >
> > On Sun 04/17/16 11:12AM EDT, K.S. Bhaskar wrote:
> > > Thanks for the links Rohit. I'll check them out. The storage is SSD, the
> > > processes do minimal IO - I'm just trying to establish the ability to have
> > > a file open by more than 32K processes, and I'm clearly running into a
> > > system limit. This is a development machine (16 cores, 32GB RAM - the
> > > production machine has something like 64 cores and 512GB RAM), but I can't
> > > get you access to poke around because it is inside a corporate network.
> > >
> > > However, as the software is all open source, I can easily help you get set
> > > up to poke around using your own system, if you want. Please let me know.
> > >
> > > Regards
> > > -- Bhaskar
> > >
> > >
> > > On Sun, Apr 17, 2016 at 10:54 AM, Rohit Mehta <ro@paper-mill.com> wrote:
> > >
> > > > Some kernel parameters to research (which may not be right for your
> > > > application)
> > > >
> > > >
> > > > https://www.debian-administration.org/article/656/Installing_Oracle11_and_Oracle12_on_Debian_Wheezy_Squeeze
> > > > and /etc/security/limits.conf changes
> > > > http://stackoverflow.com/questions/9361816/maximum-number-of-processes-in-linux
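
A quick way to see the current values of some of these limits is sketched below. The list of knobs is an assumption about what might be worth researching, not a diagnosis: nothing in this thread confirms which limit, if any, is the one being hit. The function names and the proc_sys parameter are hypothetical; the /proc/sys paths and the resource module calls are standard on Linux.

```python
import resource
from pathlib import Path

# Candidate knobs only; which one (if any) caps the workload is unknown.
SYSCTL_FILES = {
    "kernel.pid_max": "kernel/pid_max",
    "kernel.threads-max": "kernel/threads-max",
    "vm.max_map_count": "vm/max_map_count",
    "fs.file-max": "fs/file-max",
}


def read_sysctls(proc_sys="/proc/sys"):
    """Return {name: int or None}; None means the file was unreadable."""
    values = {}
    for name, rel in SYSCTL_FILES.items():
        try:
            text = (Path(proc_sys) / rel).read_text()
            values[name] = int(text.split()[0])
        except (OSError, ValueError, IndexError):
            values[name] = None
    return values


def read_rlimits():
    """Return (soft, hard) limits that apply to the current process."""
    return {
        "RLIMIT_NPROC": resource.getrlimit(resource.RLIMIT_NPROC),
        "RLIMIT_NOFILE": resource.getrlimit(resource.RLIMIT_NOFILE),
    }


if __name__ == "__main__":
    for name, value in read_sysctls().items():
        print(f"{name:20} {value}")
    for name, (soft, hard) in read_rlimits().items():
        # A value of -1 means "unlimited" (resource.RLIM_INFINITY).
        print(f"{name:20} soft={soft} hard={hard}")
```

Run it as the same user that launches the processes, since the rlimit values are per user and per session.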
> > > >
> > > > Do these processes do a lot of IO?  Is your storage rotational media or
> > > > SSD?  Can your application run off ramdisk storage?  Have you tried
> > > > enabling hyperthreading?
> > > >
> > > > Do you have the ability to test application loads on a non-production
> > > > system?  If so, I'd be interested in helping you poke around.  It might be
> > > > an education for me.
> > > >
> > > >
> > > > On Sun, Apr 17, 2016 at 10:42 AM, Rohit Mehta <ro@paper-mill.com> wrote:
> > > >
> > > >> Many years ago, I installed Oracle on my Debian workstation for fun,
> > > >> and I remember the guide had a lot of tweaks.  "ulimit" is one that I can
> > > >> think of, but I don't remember them all.  I'm poking around the internet to
> > > >> see if I can find the Oracle guide (although it might not be relevant on
> > > >> newer kernels)
> > > >>
> > > >> On Sun, Apr 17, 2016 at 10:27 AM, K.S. Bhaskar <bhaskar@bhaskars.com>
> > > >> wrote:
> > > >>
> > > >>> Thanks Steve, but in this case we have a customer need to crank up the
> > > >>> number of processes on Linux.
> > > >>>
> > > >>> Regards
> > > >>> -- Bhaskar
> > > >>>
> > > >>> On Sat, Apr 16, 2016 at 4:09 PM, Steve Litt <slitt@troubleshooters.com>
> > > >>> wrote:
> > > >>>
> > > >>>> On Fri, 15 Apr 2016 17:40:09 -0400
> > > >>>> "K.S. Bhaskar" <bhaskar@bhaskars.com> wrote:
> > > >>>>
> > > >>>> > I am trying to crank up more than 32K concurrent processes (the
> > > >>>> > processes themselves hang and do a teeny bit of activity every 10
> > > >>>> > seconds). But the OS (64-bit Debian 8 - Jessie) stubbornly refuses to
> > > >>>> > crank up beyond 32K-ish processes. pid_max is set to a very large
> > > >>>> > number (1M), so that's not it. Any suggestions on what limits to look
> > > >>>> > for appreciated. Thank you very much.
> > > >>>>
> > > >>>> This is old information, but back in the day people who wanted lots and
> > > >>>> lots of processes used one of the BSDs to host that server.
> > > >>>>
> > > >>>> SteveT
> > > >>>>
> > > >>>> Steve Litt
> > > >>>> April 2016 featured book: Rapid Learning for the 21st Century
> > > >>>> http://www.troubleshooters.com/rl21
> > > >>>>
> > > >>>>
> > > >>>> ___________________________________________________________________________
> > > >>>> Philadelphia Linux Users Group         --        http://www.phillylinux.org
> > > >>>> Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
> > > >>>> General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug
> > > >>>>
> > > >>>
> > > >>
> > > >
> >
> >
> > --
> > Gavin W. Burris
> > Senior Project Leader for Research Computing
> > The Wharton School
> > University of Pennsylvania
> > Search our documentation: http://research-it.wharton.upenn.edu/about/
> > Subscribe to the Newsletter: http://whr.tn/ResearchNewsletterSubscribe
> >


