K.S. Bhaskar on 18 Apr 2016 08:49:24 -0700



Re: [PLUG] >32K concurrent processes


Thanks for the suggestions, Gavin, but batching the load won't work in this case. We're trying to run a workload that simulates a large number of concurrent users (as you might find at a large financial or healthcare institution) all of whom expect the system to respond immediately when they ask it to do something. I intend to play with the scheduler.
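For anyone following along, one way to start playing with the scheduler from user space is to inspect a process's scheduling policy and, if desired, change it. The sketch below uses the `os.sched_*` calls that Python exposes on Linux. The helper names (`policy_name`, `describe`, `set_policy`) are made up for illustration, and nothing here is a claim about which policy suits this workload.

```python
import os

# Map the policy constants Python exposes on Linux to readable names.
# The hasattr() filter guards against constants missing on other platforms.
POLICY_NAMES = {
    getattr(os, name): name
    for name in ("SCHED_OTHER", "SCHED_BATCH", "SCHED_IDLE", "SCHED_FIFO", "SCHED_RR")
    if hasattr(os, name)
}


def policy_name(policy: int) -> str:
    """Return a readable name for a scheduling policy number."""
    return POLICY_NAMES.get(policy, f"unknown({policy})")


def describe(pid: int = 0) -> str:
    """Report the scheduling policy of a process (0 means the caller)."""
    return policy_name(os.sched_getscheduler(pid))


def set_policy(pid: int, policy: int, priority: int = 0) -> None:
    """Ask the kernel to switch a process to the given policy.

    Non-realtime policies such as SCHED_OTHER and SCHED_BATCH require
    priority 0; realtime policies normally need elevated privileges.
    """
    os.sched_setscheduler(pid, policy, os.sched_param(priority))


if __name__ == "__main__":
    print("current policy:", describe())
```

Calling `set_policy(0, os.SCHED_BATCH)` would move the calling process to the batch class; whether that helps or hurts an interactive workload like this one is something to measure, not assume.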

Regards
-- Bhaskar


On Mon, Apr 18, 2016 at 9:13 AM, Gavin W. Burris <bug@wharton.upenn.edu> wrote:
Good morning, Bhaskar.

Have you considered using /dev/shm aka tmpfs for shared memory on Linux?  Maybe stage all required files there and make sure you are read-only where possible.
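A minimal sketch of the staging idea, assuming the files are small enough to fit in RAM: copy each one into a tmpfs directory and mark the copy read-only. The function name `stage_read_only` and the `staged` subdirectory are hypothetical, chosen only for illustration.

```python
import os
import shutil
from pathlib import Path


def stage_read_only(src: Path, shm_root: Path = Path("/dev/shm")) -> Path:
    """Copy src into a tmpfs-backed directory and mark the copy read-only."""
    target_dir = shm_root / "staged"      # hypothetical subdirectory name
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / src.name
    if target.exists():
        target.unlink()                   # a stale read-only copy cannot be overwritten
    shutil.copy2(src, target)             # copy contents and metadata
    os.chmod(target, 0o444)               # read-only for owner, group, and others
    return target
```

Processes would then open the returned path instead of the original. The second argument exists so the function can be tried against an ordinary directory before pointing it at `/dev/shm`.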

With so many processes, your system is constantly context-switching between them.  Assuming you are not oversubscribing RAM (32GB / 32K works out to only about 1MB per process, before any kernel overhead), you will want to tune the kernel scheduler.
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Performance_Tuning_Guide/sect-Red_Hat_Enterprise_Linux-Performance_Tuning_Guide-CPU-Configuration_suggestions.html#sect-Red_Hat_Enterprise_Linux-Performance_Tuning_Guide-Configuration_suggestions-Tuning_scheduling_policy
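Before tuning, it can help to see the numbers involved. The sketch below works through the RAM-per-process arithmetic and prints a few kernel and per-user limits that commonly cap process counts on Linux. Which of them, if any, is the limit being hit in this thread is not established here; the helper names `read_sysctl` and `per_process_budget` are made up for illustration.

```python
import resource
from pathlib import Path
from typing import Optional


def read_sysctl(name: str) -> Optional[int]:
    """Read an integer sysctl from /proc/sys, or None if it is unavailable."""
    path = Path("/proc/sys") / name.replace(".", "/")
    try:
        return int(path.read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def per_process_budget(ram_bytes: int, processes: int) -> float:
    """Average RAM per process in MiB, ignoring kernel and shared overhead."""
    return ram_bytes / processes / 2**20


if __name__ == "__main__":
    # Sysctls that are often relevant when creating very many processes.
    for name in ("kernel.pid_max", "kernel.threads-max", "vm.max_map_count"):
        print(name, "=", read_sysctl(name))
    # Per-user process limit (what "ulimit -u" reports in the shell).
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    print("RLIMIT_NPROC soft/hard =", soft, hard)
    # 32 GiB spread over 32Ki processes.
    print("MiB per process:", per_process_budget(32 * 2**30, 32 * 2**10))
```

Taking 32GB as 32 GiB and 32K as 32768, the budget comes out to exactly 1 MiB per process, which leaves nothing for the kernel or page cache.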

This very much sounds like an HPC (high-performance computing) problem, so my initial reaction is: why not use a resource manager tuned for high throughput?  Take a look at Open Grid Scheduler (http://gridscheduler.sourceforge.net/), an open source version of Grid Engine.  This will give you a layer of control, a job queue, where you could then run a task array.  Maybe you could launch 1000 jobs that iterate 320 times?  The job queue could then be tuned to keep the system optimally utilized without overloading it: rather than running everything at once, place the work in a queue that runs through it as resources become available.  I would strongly consider using Grid Engine, especially given your statement that the procs "do a teeny bit of activity every 10 seconds."

Cheers.

On Sun 04/17/16 11:12AM EDT, K.S. Bhaskar wrote:
> Thanks for the links Rohit. I'll check them out. The storage is SSD, the
> processes do minimal IO - I'm just trying to establish the ability to have
> a file open by more than 32K processes, and I'm clearly running into a
> system limit. This is a development machine (16 cores, 32GB RAM - the
> production machine has something like 64 cores and 512GB RAM), but I can't
> get you access to poke around because it is inside a corporate network.
>
> However, as the software is all open source, I can easily help you get set
> up to poke around using your own system, if you want. Please let me know.
>
> Regards
> -- Bhaskar
>
>
> On Sun, Apr 17, 2016 at 10:54 AM, Rohit Mehta <ro@paper-mill.com> wrote:
>
> > Some kernel parameters to research (which may not be right for your
> > application)
> >
> > https://www.debian-administration.org/article/656/Installing_Oracle11_and_Oracle12_on_Debian_Wheezy_Squeeze
> > and /etc/security/limits.conf changes
> > http://stackoverflow.com/questions/9361816/maximum-number-of-processes-in-linux
> >
> > Do these processes do a lot of I/O?  Is your storage rotational media or
> > SSD?  Can your application run off ramdisk storage?  Have you tried
> > enabling hyperthreading?
> >
> > Do you have the ability to test application loads on a non-production system?
> > If so, I'd be interested in helping you poke around.  It might be an
> > education for me.
> >
> >
> > On Sun, Apr 17, 2016 at 10:42 AM, Rohit Mehta <ro@paper-mill.com> wrote:
> >
> >> Many years ago, I installed Oracle on my Debian workstation for fun,
> >> and I remember the guide had a lot of tweaks.  "ulimit" is one that I can
> >> think of, but I don't remember them all.  I'm poking around the internet to
> >> see if I can find the Oracle guide (although it might not be relevant on
> >> newer kernels).
> >>
> >> On Sun, Apr 17, 2016 at 10:27 AM, K.S. Bhaskar <bhaskar@bhaskars.com>
> >> wrote:
> >>
> >>> Thanks Steve, but in this case we have a customer need to crank up the
> >>> number of processes on Linux.
> >>>
> >>> Regards
> >>> -- Bhaskar
> >>>
> >>> On Sat, Apr 16, 2016 at 4:09 PM, Steve Litt <slitt@troubleshooters.com>
> >>> wrote:
> >>>
> >>>> On Fri, 15 Apr 2016 17:40:09 -0400
> >>>> "K.S. Bhaskar" <bhaskar@bhaskars.com> wrote:
> >>>>
> >>>> > I am trying to crank up more than 32K concurrent processes (the
> >>>> > processes themselves hang and do a teeny bit of activity every 10
> >>>> > seconds). But the OS (64-bit Debian 8 - Jessie) stubbornly refuses to
> >>>> > crank up beyond 32K-ish processes. pid_max is set to a very large
> >>>> > number (1M), so that's not it. Any suggestions on what limits to look
> >>>> > for appreciated. Thank you very much.
> >>>>
> >>>> This is old information, but back in the day people who wanted lots and
> >>>> lots of processes used one of the BSDs to host that server.
> >>>>
> >>>> SteveT
> >>>>
> >>>> Steve Litt
> >>>> April 2016 featured book: Rapid Learning for the 21st Century
> >>>> http://www.troubleshooters.com/rl21
> >>>>
> >>>> ___________________________________________________________________________
> >>>> Philadelphia Linux Users Group         --
> >>>> http://www.phillylinux.org
> >>>> Announcements -
> >>>> http://lists.phillylinux.org/mailman/listinfo/plug-announce
> >>>> General Discussion  --
> >>>> http://lists.phillylinux.org/mailman/listinfo/plug
> >>>>
> >>>
> >>>
> >>>
> >>
> >



--
Gavin W. Burris
Senior Project Leader for Research Computing
The Wharton School
University of Pennsylvania
Search our documentation: http://research-it.wharton.upenn.edu/about/
Subscribe to the Newsletter: http://whr.tn/ResearchNewsletterSubscribe