Battling the Lag Monster

Moderators: ALFA Administrators, NWN1 - WD DM

ayergo
Penguin AKA Vile Sea Tiger
Posts: 3503
Joined: Sun Jan 11, 2004 8:50 pm
Location: Germany (But frequent world travels)

Battling the Lag Monster

Post by ayergo »

One of the toughest beasties of them all, the Lag Monster has once again reared its head and attacked our beloved Waterdeep home. I've had a lot of personal inquiries and suggestions in the last week, so I thought it best to consolidate the information here so that I don't have to re-explain it. I do genuinely appreciate that folks would like to help resolve the issue; however, without an in-depth understanding of the mod and its history it's difficult for people to make sensible suggestions. I apologize for being curt in those replies, and I realize that the problem is a lack of communication on my part, which I hope to remedy here.

Background:

The Waterdeep module is one of ALFA's oldest, and I have the pleasure of saying I got to be one of the original testers when Indio was first assembling it. Not long after, it was passed to El Chip, who really put some scripting magic in and brought the place to life. The module, and ALFA in general, have passed through many hands since then, with varying degrees of technical competency. This has led to a mishmash of things and lots of legacy stuff in need of fixing. There was never really a unified "plan"; people just built things for their own particular group and never bothered to clean up other things along the way. This, combined with a lack of basic coding principles, compounded many issues into the final "official" state of the mod that I began work on in 2017.

I have managed to make slow improvements to the module over time to increase its stability. When we first launched in May 2017 we had a hard time with even 5 players on at a time, and even then the server needed a reboot every day or so. Eventually I tracked the issue to a pseudo-heartbeat script inside of an actual heartbeat script, which was putting a huge load on the server. Correcting that got us stable up to 5 people or so, which was a major hurdle to overcome.
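
For anyone wondering why that kind of nesting hurts so much, here is a toy model in Python. It is not the actual NWScript from the module, and it assumes one plausible reading of the bug: every firing of the real heartbeat starts another self-rescheduling loop (a pseudo-heartbeat) that never stops. The function name and the tick model are made up for illustration.

```python
def executions_per_tick(ticks, nested=True):
    """Return how many script executions happen in each heartbeat tick.

    Model (an assumption, not the module's real code): the real heartbeat
    fires once per tick. In the buggy case each firing also starts a new
    self-rescheduling loop that runs once on every later tick, forever.
    """
    active_loops = 0
    history = []
    for _ in range(ticks):
        runs = 1 + active_loops   # the real heartbeat plus every live loop
        if nested:
            active_loops += 1     # bug: each heartbeat spawns another loop
        history.append(runs)
    return history


buggy = executions_per_tick(10, nested=True)
fixed = executions_per_tick(10, nested=False)
print(buggy)  # load climbs by one execution every tick
print(fixed)  # load stays flat
```

Under this model the work per tick grows for as long as the server stays up, which would fit a server that needs a reboot every day or so.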

From there we held pretty well, and eventually started growing. At around 12 people or so the lag monster returned. This time it became clear that the log file was ballooning very quickly. The log files were regularly 100MB of pure text, which naturally caused a lot of I/O slowdown as the server tried to write to the file. I tracked down the responsible scripts and reduced their output, which in the end worked and got us safely up to 15 or so. Quite a large number now!
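
One generic way to cut log volume is to throttle repeated messages. The sketch below is in Python rather than NWScript and is not what was actually changed in the module; the class name, the "wealth_check" key and the 60-second window are all hypothetical. It writes each kind of message at most once per window and counts the rest as suppressed.

```python
import time


class ThrottledLog:
    """Write a given message key at most once per `min_interval` seconds."""

    def __init__(self, sink, min_interval=60.0, clock=time.monotonic):
        self.sink = sink                  # any callable that takes a string
        self.min_interval = min_interval
        self.clock = clock                # injectable so it can be tested
        self._last_written = {}           # key -> time of the last write
        self.suppressed = 0

    def log(self, key, message):
        now = self.clock()
        last = self._last_written.get(key)
        if last is not None and now - last < self.min_interval:
            self.suppressed += 1          # too soon, skip the write
            return False
        self._last_written[key] = now
        self.sink(message)
        return True


# Demo with a fake clock so the result does not depend on real time.
fake_now = [0.0]
lines = []
log = ThrottledLog(lines.append, min_interval=60.0, clock=lambda: fake_now[0])
for second in range(0, 180, 6):           # one message per 6 s for 3 minutes
    fake_now[0] = float(second)
    log.log("wealth_check", f"wealth check ran at t={second}s")
print(len(lines), "written,", log.suppressed, "suppressed")
```

The trade-off is that you lose detail inside each window, so this only suits messages that repeat without saying anything new.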

Current Situation:

We peaked around 23 people on Friday, wow! Such a huge turnout, which is really inspiring. Unfortunately the server performance took a big hit handling that many players. A lot of things in old ALFA were never built with scalability in mind, and in old ALFA 1 it was rare to see more than 10 players on a server at any given time, so probably no one ever noticed. In a lot of ways we have outgrown the old implementations of yore and need new systems in place to handle core functionality. You see, in the old ALFA zealotry of hunting down PGers, they made complicated systems to track and watch players' wealth, which in the end made the game unplayable in more ways than one. To this they tied poorly implemented sub-systems such as horses and subraces. All of these add up as we get more and more players on at the same time.

To put it simply, the problem isn't finding what causes the lag, it's how to fix it without breaking everything else. I'm fairly confident I can make a few adjustments without editing the haks, but you might start seeing some weird behavior or features getting lost. I think some things, like subraces, were neither well implemented nor popular and can be cut, with a DM adjusting stats instead. Additionally I will probably take horses out entirely (as every player gets a horse heartbeat every cycle) until Duck or someone else can fix the system. Hopefully this will go a long way to making 20+ players a more stable environment.

In short, the problem isn't identifying the issue, the problem is fixing the issue without breaking everything else.

FAQ:

1) "But what about X? I'm an expert in X and I think the problem is here." - You are right! That is a source of lag, however it's not the main and most pressing source of lag we currently face. Additionally, finding sources of lag isn't really the problem. The Waterdeep module is a "target rich" environment for finding problems. I do however get a lot of weird ideas from folks on where they think the lag comes from, which often border on near-religious fervor. Sadly we are our own worst enemy, and deciding work priorities by committee as we did in the past is a sure way to fail. While your suggestions have good intent, the end result is a distraction from higher priority fixes. The real issue isn't in identification, it's in correction.

2) "Can't we just buy a beefier server?" - Sadly no. At the height of our issues last week we had about 30% processor usage on one core (the other core idle) and about 35% memory usage. Network capacity is also underutilized; it is probably one of the faster internet connections I've seen in my life, pulling my 100MB updates from Google in less than a second. I use an external cloud-based VM host from softsys, who have had excellent support, and at $20 a month I couldn't be happier with their service.

3) "What can I do to help?" - Be patient for the solution! This has been a much more successful project than I anticipated, however at the end of the day I'm just one dude sitting around in his underwear writing code as some perverse idea of "fun". My time is somewhat limited by the fact that I manage several different (non-NWN) projects on the side, in addition to work and RL demands and of course the desire to run cool stories and events. A glutton for punishment, you might say, but if you feel like being crazy with me then learn to code and put out some quality stuff. There's no shortage of scripts that need rework and I'm sure we can find something that interests you, even easy stuff to start.
There's a place I like to hide
A doorway that I run through in the night
Relax child, you were there
But only didn't realize and you were scared
It's a place where you will learn
To face your fears, retrace the years
And ride the whims of your mind
Mick
Beholder
Posts: 1946
Joined: Mon May 30, 2005 2:19 am
Location: Why do you want to know?

Re: Battling the Lag Monster

Post by Mick »

Thanks for the update and for your efforts!
Talk less. Listen more.

Current PCs: ?
Stormbringer
Owlbear
Posts: 587
Joined: Mon Jan 05, 2004 6:45 am
Location: USA GMT - 6

Re: Battling the Lag Monster

Post by Stormbringer »

Thanks for all your effort!
Current PC:
Former PC's
Saman Barb/Sorcerer
Kal Rogue/Ranger of Selune
Aiden Ketter Priest of Kelemvor
Kree (ubber not smart Barb)
Past PC: Jena Steel | Hamar Marrion (Marcus)and many other dead PC's
oldgrayrogue
Retired
Posts: 3284
Joined: Thu Jan 24, 2008 7:09 am
Location: New York

Re: Battling the Lag Monster

Post by oldgrayrogue »

The Penguin rules. Praise him with offerings of XP.
jmecha
Illithid
Posts: 1699
Joined: Mon Nov 15, 2004 4:22 pm
Location: Chicago

Re: Battling the Lag Monster

Post by jmecha »

I feel obligated to post a suggestion about how to fix the lag, but I am not knowledgeable enough to sound convincing. Only my ignorance has spared you.
Current Characters: Aelenta Renvanith
Wynna
Dungeon Master
Posts: 5734
Joined: Sat Jan 03, 2004 10:09 am
Location: Seattle, WA (PST)

Re: Battling the Lag Monster

Post by Wynna »

Very useful to know all this Ayergo. Thank you.
Enjoy the game
Magile
Otyugh
Posts: 920
Joined: Wed Jan 07, 2004 7:00 pm
Location: The Big Nowhere

Re: Battling the Lag Monster

Post by Magile »

ayergo wrote: Additionally I will probably take horses out entirely (as every player gets a horse heartbeat every cycle) until Duck or someone else can fix the system.
May I inquire what this means, exactly? So there's a script firing for riding horses for every PC on the server, even if they aren't riding a horse?
Part of ALFA since May 2000.
NWN 2 PC (BG): Layali Mae (Arcane Trickster)
NWN 2 PC (MS): Marius Lobhdain (Druid)
Curmudgeon in IRC wrote:(2:29:40 PM) Curmudgeon: The community wants 24/7 DM coverage, free xp, and a suit of mithral plate mail in every pchest.
wvincenti
Rust Monster
Posts: 1129
Joined: Mon Jan 05, 2004 5:32 pm
Location: NJ, USA (GMT -5)

Re: Battling the Lag Monster

Post by wvincenti »

Thank you!!!

-Bill
  • Currently NWN1 ALFA: Ryld Ky'bler
    Currently NWN2: Gwindor Faelivrin, still not actually dead!

    Formerly: Timyin Tim, Glorfindel Inglorion and Beleg Thalionestel amongst others.
Duck One
Orc Champion
Posts: 423
Joined: Mon Jan 05, 2004 1:51 am
Location: Indiana (EST)

Re: Battling the Lag Monster

Post by Duck One »

ayergo wrote: Additionally I will probably take horses out entirely (as every player gets a horse heartbeat every cycle) until Duck or someone else can fix the system.


I looked at the scripts for horses. There is logic in there for the horse to "perceive a threat" and panic, possibly throw the rider, and then run away from the threat. Kind of a cool idea, unless you consider the NWN engine and all the stuff it does. The scripts also "root" the horses when you tie them up by applying a permanent "tangled" effect, meaning the horse can still perceive a threat (or try to) and want to run. This appears to me to put an infinite number of checks and timers whenever a "tied up horse" comes in contact with a predator.

Knowing this game engine, I would have opted for a simpler solution, treating them more like familiars which the user must instruct how to act. A bit lower on the "coolness" factor, but much less taxing on the game engine. I agree with Ayergo. No amount of horsing around (pun intended) is worth any additional lag. I'm not sure how many other things are being done "on heartbeat", but they should be few and far between if performance is to be sustained.

I think Ayergo's assessment of the logging as another possible culprit is likely a rational one. Writing to text in an infinitely growing file is inefficient and should be done sparingly.

I appreciate the team's efforts to deliver a well-performing server.
Duck One

Some guy who used to do some work 'round here.
ayergo
Penguin AKA Vile Sea Tiger
Posts: 3503
Joined: Sun Jan 11, 2004 8:50 pm
Location: Germany (But frequent world travels)

Re: Battling the Lag Monster

Post by ayergo »

Magile wrote:
ayergo wrote: Additionally I will probably take horses out entirely (as every player gets a horse heartbeat every cycle) until Duck or someone else can fix the system.
May I inquire what this means, exactly? So there's a script firing for riding horses for every PC on the server, even if they aren't riding a horse?
Correct
Cast_No_Shadow
Wyvern
Posts: 861
Joined: Wed Jan 07, 2004 3:24 pm

Re: Battling the Lag Monster

Post by Cast_No_Shadow »

For those not in the know: heartbeat scripts should be treated like hallowed ground. Anything in there runs for every active area, PC, item, NPC and monster every 6 seconds. That adds up quickly in terms of load.
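
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 6-second interval and the 23 players come from this thread; the area and NPC counts are invented purely for illustration, and the function name is hypothetical.

```python
HEARTBEAT_SECONDS = 6  # interval quoted in this thread


def heartbeat_runs_per_minute(object_counts):
    """Script executions per minute if every object runs one heartbeat script."""
    total_objects = sum(object_counts.values())
    return total_objects * (60 // HEARTBEAT_SECONDS)


# Only the player count is from the thread; the other numbers are made up.
counts = {"players": 23, "areas": 50, "npcs_and_monsters": 400}
print(heartbeat_runs_per_minute(counts))

# The per-player horse heartbeat alone, at 23 players:
print(heartbeat_runs_per_minute({"players": 23}))
```

Each extra script attached to a heartbeat multiplies that total again, which is why even a cheap check gets expensive once player and object counts grow.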
Galadorn
Haste Bear
Posts: 2483
Joined: Sat Feb 07, 2004 9:10 am
Location: Hefei, China

Re: Battling the Lag Monster

Post by Galadorn »

Thank you Ayergo! (and Mick, and CNS)
Rumple C
Bard
Posts: 3561
Joined: Thu Jul 22, 2004 9:38 pm
Location: The ceiling.

Re: Battling the Lag Monster

Post by Rumple C »

ayergo wrote:I'm just one dude sitting around in his underwear
Pics plz
12.August.2015: Never forget.
Senor T
Ogre
Posts: 629
Joined: Mon Jan 05, 2004 7:42 pm
Location: Durham, NC (US Eastern)

Re: Battling the Lag Monster

Post by Senor T »

I WAS RIGHT! IT IS INVISIBLE HORSES!
Currently laying the smackdown on Faerun as: Keryn Tel'Jora, who is XXX-TREME!!!.
Currently explaining the meaninglessness of it all as Vizian Nazyr.
Currently pointing out all other characters' shortcomings as Stephen the Archer.
ayergo
Penguin AKA Vile Sea Tiger
Posts: 3503
Joined: Sun Jan 11, 2004 8:50 pm
Location: Germany (But frequent world travels)

Re: Battling the Lag Monster

Post by ayergo »

Rumple C wrote:
ayergo wrote:I'm just one dude sitting around in his underwear
Pics plz
That costs extra!