Gavin W. Burris on 18 Apr 2016 13:04:05 -0700
Re: [PLUG] >32K concurrent processes
Hi, Bhaskar.

That would definitely lend benefits in some ways. I can't help feeling the architecture is showing its age, with the problem of not being able to scale past a single node efficiently. I guess overloading a single host is still advantageous for problems of a certain size. What is the breaking point where the latency of inter-node communication is acceptable? What use case is driving the decision to so strongly avoid going multi-node?

Cheers.

On Mon 04/18/16 03:56PM EDT, K.S. Bhaskar wrote:
> Gavin, the clients and database are the same. The database logic is inside application processes, or application logic is inside the database - either works out to the same thing. That's the trend these days in very high end databases, except that this has been GT.M's architecture since way back when.
>
> Regards
> -- Bhaskar
>
> On Mon, Apr 18, 2016 at 3:36 PM, Gavin W. Burris <bug@wharton.upenn.edu> wrote:
>
> > Just saw your previous post, that the 32k is for the testing clients, not somehow the database. I'd just spin up more boxes, once you find the optimal number of clients a single one can handle. Keep us posted. This is neat stuff. Cheers.
> >
> > On Mon 04/18/16 03:32PM EDT, Gavin W. Burris wrote:
> > > Hi, Bhaskar.
> > >
> > > This sounds really neat. Why do you need to simultaneously serialize ALL transactions? For instance, my bank balance or my medical records have absolutely no real-time serial dependencies on any other account. Maybe a service fee on my medical record is a dependency, but just update those daily. My balance may depend on a transfer, but just look at the posted timestamp. If one needs a bank-wide report, again, just look at transactions to a specific timestamp. What is an acceptable granularity? Sure, you can get millisecond accuracy this way, but why would you want that given the downsides? Is this some kind of high-frequency trading scheme?
> > > If so, any further communications will have to be under billable hours for my private consulting services. :D
> > >
> > > Cheers.
> > >
> > > On Mon 04/18/16 02:44PM EDT, K.S. Bhaskar wrote:
> > > > This is not a distributed environment - it's a single system. The reason is transaction serialization. When every transaction can potentially depend on the result of the preceding transaction, the more you can centralize serialization decision making, the faster you can make decisions required to ensure ACID properties at transaction commit time. With GT.M, this serialization is done in the shared memory of a single computing node. Even with technologies such as RDMA over Infiniband, IPC between processes on different nodes is one to two orders of magnitude slower than between processes on a single node. So, as long as throughput is not constrained by the amount of CPU, RAM, or IO you can put on a single node, centralized serialization gives you the best overall throughput. With GT.M, and with the types of computer system you can purchase today, the throughput you can achieve on a single node is big enough to handle the needs of real-time core-processing (a core system is the system of record for your bank balance) on just about any bank. The largest real-time core systems in production anywhere in the world today that I know of run on GT.M - these are systems with over 30 million accounts. In healthcare, the real-time electronic health records for the entire Jordanian Ministry of Health system are being rolled out on a single system (⅓ of the electronic health records for a country with the area and population of Indiana processed on a single system).
> > > >
> > > > What people think of as a horizontally scalable architecture for a transactional system is stateless application servers that can be spun up as needed, but which send all the needed state to a database under the covers. This architecture scales only as well as the database scales on a single node, which is to say not very well - in our testing some years ago, we found that because of transaction serialization, a popular database scaled better on a single node than across multiple nodes in a cluster.
> > > >
> > > > So thanks for all the suggestions but for now, the specific information I need is how to configure a Linux system to allow more than 32K concurrent processes. Increasing pid_max is a necessary change, but clearly not a sufficient change.
> > > >
> > > > Regards
> > > > -- Bhaskar
> > > >
> > > > On Mon, Apr 18, 2016 at 12:19 PM, Keith C. Perry <kperry@daotechnologies.com> wrote:
> > > >
> > > > > Bhaskar,
> > > > >
> > > > > What's the deployment infrastructure? When you say "We're trying to run a workload that simulates a large number of concurrent users (as you might find at a large financial or healthcare institution)", it makes me think that this is more of a distributed environment where you need a large number of clients being serviced by a pool of servers.
> > > > >
> > > > > If that is the case it sounds more like you would need a listener (server) running that would then fork off or thread child connections to respond to client requests. This is also something that can be achieved in a local context, with the listener on the localhost IP or using unix sockets.
> > > > >
> > > > > ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
> > > > > Keith C. Perry, MS E.E.
> > > > > Owner, DAO Technologies LLC
> > > > > (O) +1.215.525.4165 x2033
> > > > > (M) +1.215.432.5167
> > > > > www.daotechnologies.com
> > > > >
> > > > > ------------------------------
> > > > > *From: *"K.S. Bhaskar" <bhaskar@bhaskars.com>
> > > > > *To: *"Philadelphia Linux User's Group Discussion List" <plug@lists.phillylinux.org>
> > > > > *Sent: *Monday, April 18, 2016 11:49:19 AM
> > > > > *Subject: *Re: [PLUG] >32K concurrent processes
> > > > >
> > > > > Thanks for the suggestions, Gavin, but batching the load won't work in this case. We're trying to run a workload that simulates a large number of concurrent users (as you might find at a large financial or healthcare institution) all of whom expect the system to respond immediately when they ask it to do something. I intend to play with the scheduler.
> > > > >
> > > > > Regards
> > > > > -- Bhaskar
> > > > >
> > > > > On Mon, Apr 18, 2016 at 9:13 AM, Gavin W. Burris <bug@wharton.upenn.edu> wrote:
> > > > >
> > > > >> Good morning, Bhaskar.
> > > > >>
> > > > >> Have you considered using /dev/shm aka tmpfs for shared memory on Linux? Maybe stage all required files there and make sure you are read-only where possible.
> > > > >>
> > > > >> With so many processes, your system is just constantly changing threads. Assuming you are not oversubscribing RAM (32GB / 32k is less than 1MB per), you will want to tune the kernel scheduler.
> > > > >>
> > > > >> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Performance_Tuning_Guide/sect-Red_Hat_Enterprise_Linux-Performance_Tuning_Guide-CPU-Configuration_suggestions.html#sect-Red_Hat_Enterprise_Linux-Performance_Tuning_Guide-Configuration_suggestions-Tuning_scheduling_policy
> > > > >>
> > > > >> This very much sounds like an HPC problem (high-performance computing), so my initial reaction is why not use a resource manager tuned for high-throughput? Take a look at Open Grid Scheduler (http://gridscheduler.sourceforge.net/), an open source version of Grid Engine. This will give you a layer of control, a job queue, where you could then do a task array. Maybe you could launch 1000 jobs that iterate 320 times? The job queue could then be tuned to not overload the system and keep the system maximally / optimally utilized, aka don't run everything at once but place it in a queue that runs through what you need as resources are available. I would strongly consider using Grid Engine, especially given your statement that the procs "do a teeny bit of activity every 10 seconds."
> > > > >>
> > > > >> Cheers.
> > > > >>
> > > > >> On Sun 04/17/16 11:12AM EDT, K.S. Bhaskar wrote:
> > > > >> > Thanks for the links Rohit. I'll check them out. The storage is SSD, the processes do minimal IO - I'm just trying to establish the ability to have a file open by more than 32K processes, and I'm clearly running into a system limit. This is a development machine (16 cores, 32GB RAM - the production machine has something like 64 cores and 512GB RAM), but I can't get you access to poke around because it is inside a corporate network.
> > > > >> >
> > > > >> > However, as the software is all open source, I can easily help you get set up to poke around using your own system, if you want. Please let me know.
> > > > >> >
> > > > >> > Regards
> > > > >> > -- Bhaskar
> > > > >> >
> > > > >> > On Sun, Apr 17, 2016 at 10:54 AM, Rohit Mehta <ro@paper-mill.com> wrote:
> > > > >> >
> > > > >> > > Some kernel parameters to research (which may not be right for your application)
> > > > >> > >
> > > > >> > > https://www.debian-administration.org/article/656/Installing_Oracle11_and_Oracle12_on_Debian_Wheezy_Squeeze
> > > > >> > > and /etc/security.conf changes
> > > > >> > >
> > > > >> > > http://stackoverflow.com/questions/9361816/maximum-number-of-processes-in-linux
> > > > >> > >
> > > > >> > > Do these processes do a lot of IO? Is your storage rotational media or SSD? Can your application run off ramdisk storage? Have you tried enabling hyperthreading?
> > > > >> > >
> > > > >> > > Do you have the ability to test application loads on a non-production system? If so I'd be interested in helping you poke around. It might be an education for me.
> > > > >> > >
> > > > >> > > On Sun, Apr 17, 2016 at 10:42 AM, Rohit Mehta <ro@paper-mill.com> wrote:
> > > > >> > >
> > > > >> > >> Back many years ago, I installed Oracle on my Debian workstation for fun, and I remember the guide had a lot of tweaks. "ulimit" is one that I can think of, but I don't remember them all. I'm poking around the internet to see if I can find the oracle guide (although it might not be relevant on newer kernels)
> > > > >> > >>
> > > > >> > >> On Sun, Apr 17, 2016 at 10:27 AM, K.S. Bhaskar <bhaskar@bhaskars.com> wrote:
> > > > >> > >>
> > > > >> > >>> Thanks Steve, but in this case we have a customer need to crank up the number of processes on Linux.
> > > > >> > >>>
> > > > >> > >>> Regards
> > > > >> > >>> -- Bhaskar
> > > > >> > >>>
> > > > >> > >>> On Sat, Apr 16, 2016 at 4:09 PM, Steve Litt <slitt@troubleshooters.com> wrote:
> > > > >> > >>>
> > > > >> > >>>> On Fri, 15 Apr 2016 17:40:09 -0400
> > > > >> > >>>> "K.S. Bhaskar" <bhaskar@bhaskars.com> wrote:
> > > > >> > >>>>
> > > > >> > >>>> > I am trying to crank up more than 32K concurrent processes (the processes themselves hang and do a teeny bit of activity every 10 seconds). But the OS (64-bit Debian 8 - Jessie) stubbornly refuses to crank up beyond 32K-ish processes. pid_max is set to a very large number (1M), so that's not it. Any suggestions on what limits to look for appreciated. Thank you very much.
> > > > >> > >>>>
> > > > >> > >>>> This is old information, but back in the day people who wanted lots and lots of processes used one of the BSDs to host that server.
> > > > >> > >>>>
> > > > >> > >>>> SteveT
> > > > >> > >>>>
> > > > >> > >>>> Steve Litt
> > > > >> > >>>> April 2016 featured book: Rapid Learning for the 21st Century
> > > > >> > >>>> http://www.troubleshooters.com/rl21

--
Gavin W. Burris
Senior Project Leader for Research Computing
The Wharton School
University of Pennsylvania
Search our documentation: http://research-it.wharton.upenn.edu/about/
Subscribe to the Newsletter: http://whr.tn/ResearchNewsletterSubscribe
___________________________________________________________________________
Philadelphia Linux Users Group -- http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion -- http://lists.phillylinux.org/mailman/listinfo/plug
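
On the question that started this thread (which limits besides pid_max can cap the process count), here is a minimal sketch for inspecting a few candidate limits on a Linux host. It assumes Python 3 and the usual /proc/sys layout. The function names are hypothetical, chosen only for illustration, and nothing in the thread confirms that any of these values is the limit Bhaskar actually hit. The last helper simply spells out the "32GB / 32k" arithmetic for the development machine: it comes to exactly 1 MiB per process before any kernel or shared-memory overhead, so the usable figure is somewhat less.

```python
#!/usr/bin/env python3
"""Report a few Linux limits that can cap the number of concurrent processes."""
import resource

# Standard procfs locations on Linux; assumed to be readable by this user.
SYSCTL_FILES = {
    "kernel.pid_max": "/proc/sys/kernel/pid_max",
    "kernel.threads-max": "/proc/sys/kernel/threads-max",
}


def read_int(path):
    """Return the integer stored in a procfs file, or None if unavailable."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def format_rlimit(value):
    """Render an rlimit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def collect_limits():
    """Gather the candidate limits into a dict keyed by limit name."""
    limits = {name: read_int(path) for name, path in SYSCTL_FILES.items()}
    # RLIMIT_NPROC is the per-user process limit (what "ulimit -u" reports).
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    limits["RLIMIT_NPROC (soft)"] = format_rlimit(soft)
    limits["RLIMIT_NPROC (hard)"] = format_rlimit(hard)
    return limits


def ram_per_process(total_bytes, process_count):
    """Average RAM per process, ignoring kernel and page-cache overhead."""
    return total_bytes // process_count


if __name__ == "__main__":
    for name, value in collect_limits().items():
        print(f"{name:22} {value}")
    # 32 GiB spread over 32Ki processes, as on the development machine.
    print("bytes per process:", ram_per_process(32 * 2**30, 32 * 2**10))
```

Run it as the same user that launches the workload, since RLIMIT_NPROC is per-user. Any value that sits near 32K is worth a closer look, but treat the list as a starting point rather than a diagnosis.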