Reporting is blank but widget works (TrueNAS 13.0)

hjarnek

Cadet
Joined
Dec 1, 2022
Messages
9
Hello!

My steaming fresh installation of TrueNAS Core 13.0-U3.1 shows null/zero/nada/niente on all reporting graphs, but the CPU and memory widgets on the dashboard work fine. As I've understood from this forum, this has been a common issue for a very long time, but was allegedly fixed in version 12.0-U8. Still, here I am. I am using Firefox, but the problem exists in Chrome as well. I have cleared browser cache multiple times, tried in private browser sessions, restarted collectd with "service collectd onerestart" – nothing works. There are apparently more people who still have this problem (e.g. here and here). Are there any solutions?
 

hjarnek

No one has a clue? Really? Are we supposed to do without reporting graphs, or are we supposed not to use TrueNAS versions above 12.0-U6?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
What did the developers say in response to your bug report? Because of the way that users have a diverse mix of browsers, configurations, and extensions, your web browser is not going to be the same as everyone else's. If no one else is seeing your problem, then it probably requires some investigation.
 

hjarnek

jgreco said:
What did the developers say in response to your bug report? Because of the way that users have a diverse mix of browsers, configurations, and extensions, your web browser is not going to be the same as everyone else's. If no one else is seeing your problem, then it probably requires some investigation.

Well, judging by other threads in these forums, like the ones I linked, it seems more people are indeed experiencing the same problem and have done so for a long time. I am not a developer, but I managed to figure out how to create a Jira account and submit a bug report. Still no response after three days though. I don't know if I'm supposed to do anything more, but I hope my efforts to bring this issue to attention will not lead to nothing.
 

jgreco

hjarnek said:
Well, judging by other threads in these forums, like the ones I linked, it seems more people are indeed experiencing the same problem and have done so for a long time.

Speaking as someone who reads a large percentage of the posts here in the forums, I don't see either of those things as true. What usually happens is that whatever widget toolset they use gets updated to some new version, something breaks for some people using some specific browser, a few complaints show up, and we refer them to Jira. The other alternative is that collectd or related stats stuff gets borked under some specific conditions, and this hoses things for a larger set of people who happen to be running things that tickle that. Reliably gathering stats in UNIX is a bit twitchy. The developers are integrating code written by others into the project, and it is simply a fact that things can break. In the past, they even had to jettison an entire misadventure down a trail of bad choices known as FreeNAS Corral.

hjarnek said:
Still no response after three days though.

It is quite possible that you won't get any response until someone is tasked with looking at the issue. That, in turn, might happen in an hour but might also never happen, depending on various factors such as whether or not they need more details from you, have already isolated and corrected the problem, are seeing your particular case as an outlier, or just don't have the developer resources to assign to it.

hjarnek said:
I don't know if I'm supposed to do anything more, but I hope my efforts to bring this issue to attention will not lead to nothing.

It's hard to say. A lot of it revolves around the reproducibility of the issue. Obviously it isn't broken for most people, which is the conundrum developers face when working with a complex software stack like this.
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
jgreco said:
A lot of it revolves around the reproducibility of the issue. Obviously it isn't broken for most people, which is the conundrum developers face when working with a complex software stack like this.

Doing software development in my "day job" I can only second this. Providing enough information to reproduce an issue is often much more difficult than most people think. It begins with the obvious things (version of the software), goes over to somewhat subtle aspects (e.g. what @jgreco mentioned about browser extensions), and often ends with the exact(!) click-path someone took. In many cases it even makes a difference whether or not a certain functionality was accessed as the very first action after the login, or not.

In terms of expectation management, from where I stand there is nothing to complain about after 3 days of waiting. I have been there as well and on a personal level understand that it is frustrating. But there is no SLA, and therefore no legal obligation, attached to the ticket. So of course paying customers will be dealt with first. Not sure that reduces the pain. But on the other hand we got a very serious storage system for free, while other vendors charge 6- or 7-digit figures for something more or less comparable.
 

hjarnek

jgreco said:
What usually happens is that whatever widget toolset they use gets updated to some new version, something breaks for some people using some specific browser, a few complaints show up, and we refer them to Jira.

Well, in this case neither of these applies, as it's a totally fresh installation and everything is still at "factory defaults": no pools have been created and no configurations have been made yet. And the problem exists in all major web browsers. I may also add that it worked fine when I tried with an old FreeNAS ISO instead, so it doesn't seem like a hardware problem.

jgreco said:
A lot of it revolves around the reproducibility of the issue. Obviously it isn't broken for most people, which is the conundrum developers face when working with a complex software stack like this.

ChrisRJ said:
Providing enough information to reproduce an issue is often much more difficult than most people think. It begins with the obvious things (version of the software), goes over to somewhat subtle aspects (e.g. what @jgreco mentioned about browser extensions), and often ends with the exact(!) click-path someone took. In many cases it even makes a difference whether or not a certain functionality was accessed as the very first action after the login, or not.

Considering the above, reproducing the issue should be dead easy: 1. Install latest version of TrueNAS. 2. Go to the reporting tab.
Obviously, far from everyone is experiencing this problem, but there are some of us who do (not just me). Here's another example, a fairly recent post by someone experiencing the same issue in SCALE: https://www.truenas.com/community/threads/reporting-page-is-blank-in-truenas-scale.103564/ And here's a bug report for SCALE filed a couple of months earlier: https://ixsystems.atlassian.net/browse/NAS-116687. Both without solution. My point is that this is not an outlier case, and that debug logs are probably the only thing that will bring this closer to a solution. I will be happy to provide them, if only someone asks.

ChrisRJ said:
In terms of expectation management, from where I stand there is nothing to complain about after 3 days of waiting. I have been there as well and on a personal level understand that it is frustrating. But there is no SLA, and therefore no legal obligation, attached to the ticket. So of course paying customers will be dealt with first. Not sure that reduces the pain. But on the other hand we got a very serious storage system for free, while other vendors charge 6- or 7-digit figures for something more or less comparable.

Of course I am not expecting any legal rights for a free-to-use product. But I'm expecting that any efforts to help, in this case by reporting a bug, are reciprocated :) I'm well aware though that developer resources are limited.
 

jgreco

hjarnek said:
Well, in this case neither of these applies, as it's a totally fresh installation and everything is still at "factory defaults": no pools have been created and no configurations have been made yet.

Your Windows is also a fresh install with a totally default browser? You've never gone to any other websites on the Internet?

You're using standard modern X11 or X12 based Supermicro hardware? Or something more unusual? Conspicuously missing from your original post.

So many factors to consider.

hjarnek said:
it doesn't seem like a hardware problem.

Unfortunately, software is very malleable, and things change. Fixes intended to solve one problem happen to also cause others. I recently found that I could no longer boot FreeBSD virtual machines with less than 128MB RAM, or with certain combinations of VM hardware and FreeBSD versions.

hjarnek said:
reproducing the issue should be dead easy: 1. Install latest version of TrueNAS. 2. Go to the reporting tab.

Okay. I've got the latest version of TrueNAS Core installed on one production machine, and Scale on a test host so that I could write solnet-array-test-v3 (otherwise I might not ever run the thing). Both work. Now what?

hjarnek said:
fairly recent post by someone experiencing the same issue in SCALE: https://www.truenas.com/community/threads/reporting-page-is-blank-in-truenas-scale.103564/ And here's a bug report for SCALE filed a couple of months earlier: https://ixsystems.atlassian.net/browse/NAS-116687. Both without solution.

So, two examples of the problem, versus tens of thousands of installed instances.

Years ago, I had a client in the form of a small Internet technology company that was writing their own cloud management system. I found myself consistently amazed by the sheer number of arcane bits of knowledge the web frontend guys had, about what worked and what was supposed to work but didn't. One day one of them gave me a hint that has kinda terrified me in the years since: look at the HTML. Oftentimes, developers put little notes in about the things that do or don't work. Some of the comments I see in HTML content suggest a terrifying amount of only vague HTML compatibility, such as this jewel of an incomprehensible line:

Code:
<!-- Progressive Enhancements : END --><!-- What it does: Makes background images in 72ppi Outlook render at correct size. --><!--[if gte mso 9]><!-- The title tag shows in email notifications, like Android 4.4. --><!-- Web Font / @font-face : BEGIN --><!-- NOTE: If web fonts are not required, lines 10 - 27 can be safely removed. --><!-- Desktop Outlook chokes on web font references and defaults to Times New Roman, so we force a safe fallback font. --><!--[if mso]>


I'm sure that some of this might make sense to a web developer, but it just shows how much work goes into customizing for platforms like Outlook or Android. I have a huge amount of respect for people like @ChrisRJ who can work in the modern web environment. And I know it is equally hard to work on the backend of that, transforming the stuff that hardware emits into a more comprehensible and usable format for the end user.
 

ChrisRJ

hjarnek said:
Considering the above, reproducing the issue should be dead easy: 1. Install latest version of TrueNAS. 2. Go to the reporting tab.

Just a few example points why it may not be that easy:
  1. Installation
    • Locale setting (not sure TrueNAS offers this, but it is meant as an example and often the reason why things fail on some systems only)
    • Boot device file system details
    • Pool setup details
  2. Going to reporting
    • Browser version, plugins, non-default settings
    • Workstation OS incl. non-default fonts installed
    • Language settings of workstation
The tricky thing, and that is what I am trying to convey with those examples, is to identify the difference between your system and others. There are situations when this is relatively obvious, and those cases are easy to fix then. Overall, it is a bit like going to a doctor with an "obscure" disease. The diagnosis is often more difficult than devising a therapy.
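
As a rough illustration of the workstation-side items in that list, here is a minimal sketch of how a few of them could be captured in a form that can be pasted into a ticket. It only reads standard JVM system properties on the machine it runs on; it knows nothing about the browser or about TrueNAS itself, and the class and method names are made up for this example.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of any TrueNAS tooling.
class EnvironmentReportSketch {

    // Collects a few facts about the workstation the JVM is running on.
    static Map<String, String> collect() {
        Map<String, String> facts = new LinkedHashMap<>();
        facts.put("os.name", System.getProperty("os.name"));
        facts.put("os.version", System.getProperty("os.version"));
        facts.put("os.arch", System.getProperty("os.arch"));
        facts.put("java.version", System.getProperty("java.version"));
        // Language settings are a frequent source of "works for me" differences.
        facts.put("default.locale", Locale.getDefault().toLanguageTag());
        facts.put("default.charset", Charset.defaultCharset().name());
        return facts;
    }

    // Renders the facts as "key = value" lines, in insertion order.
    static String format(Map<String, String> facts) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : facts.entrySet()) {
            sb.append(e.getKey()).append(" = ").append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(format(collect()));
    }
}
```

Browser version, extensions and installed fonts would still have to be written down by hand.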

Coming back to software: Just like in the health context, sometimes the symptom and the root cause can only be connected via a chain of multiple intermediate causes. I once had a case where two systems from the same vendor were designed to work on the same business situation (think of something like processing an order as an example). Those were client-server applications with fat clients running on Windows. The two clients never talked to each other directly but would set a flag on their own server to indicate that the order could be picked up by the other server for the next step.

This had been tested, not on thousands of installations, but certainly on more than 100. However, on our setup the hand-off never worked and duplicates were created, which then caused consistency errors down the road. It took a whole week and more than 200 person hours to isolate the culprit: One of the clients (not a server) had a non-US locale setting for its Java Virtual Machine (automatically done by the Java installer, based on the OS settings), which somehow affected a string comparison. So two strings that were used as global identifiers (aka UUIDs) were considered to be different, although they were actually identical (and no, they did not contain non-ASCII characters). As a consequence records could not be matched and things went south from there ...
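
The story does not say which string operation was involved, so the following is only a sketch of one well-known way a JVM locale can change a comparison: case folding with `String.toLowerCase`, which uses the JVM's default locale unless a locale is passed explicitly. The class name, the `fold` helper and the identifier are made up for illustration. The identifier deliberately contains the letter I, which real hexadecimal UUIDs do not, so this shows the class of problem rather than the actual bug.

```java
import java.util.Locale;

// Hypothetical illustration, not the vendor's code.
class LocaleFoldSketch {

    // Lower-cases an identifier using the given locale. Calling
    // s.toLowerCase() without an argument behaves like this with
    // Locale.getDefault(), i.e. whatever the installer configured.
    static String fold(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    public static void main(String[] args) {
        String id = "ORDER-ID-42"; // made-up identifier, plain ASCII
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // Same input, folded on two "machines" with different locales.
        String onNeutralClient = fold(id, Locale.ROOT);
        String onTurkishClient = fold(id, turkish);

        // Under Turkish rules an upper-case I lower-cases to a dotless i,
        // so the two keys no longer match.
        System.out.println(onNeutralClient.equals(onTurkishClient)); // prints false

        // Folding with a fixed locale on both sides keeps them equal.
        System.out.println(fold(id, Locale.ROOT).equals(onNeutralClient)); // prints true
    }
}
```

Passing a fixed locale such as `Locale.ROOT` wherever identifiers are normalised is a common way to keep such comparisons independent of what the installer picked up from the OS.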

jgreco said:
I have a huge amount of respect for people like @ChrisRJ who can work in the modern web environment. And I know it is equally hard to work on the backend of that, transforming the stuff that hardware emits into a more comprehensible and usable format for the end user.

Thanks, @jgreco , but my web knowledge is actually rather limited. In fact I am primarily working on the back-end side and specifically distributed systems. Those, because of the by definition unreliable nature of connections, have some fascinating challenges.

In closing: Why am I writing all this? I am a developer by heart and almost all developers I know take a lot of pride in doing a good job, whatever their practical experience and formal seniority level. So "we" are generally upset and frustrated if someone like you, @hjarnek, runs into issues like the ones you described. But unfortunately, those issues are typically very hard to diagnose, especially when communication happens only via an issue tracker and not something like a video call with screen sharing.

My personal recommendation would be to do a fresh install and write down literally every click and keystroke you make. With luck, the problem goes away, because subconsciously you do something just a tiny little bit differently. If the issue persists, you have a very comprehensive log that can be used to reproduce things, or at least exclude a lot of factors.

Hope that helps!
 

hjarnek

@jgreco, I'm not sure what your point is, other than that coding is complicated. I know enough to be well aware of that fact. All I'm saying is that this problem I'm having is not an odd one-off and it's not browser-specific, and considering that it seems to have been present for such a long time, I would expect some interest in solving it. Worryingly enough, it almost seems you want to say I should not expect that. I understand that the vast majority do not have this problem, and therefore I would be happy to help with any requested debug logs. In any case I really appreciate your and @ChrisRJ's interest in this topic. Cheers.


@ChrisRJ, thanks for an interesting story. Imagine all the headaches we could avoid with better international standards. In my case I'm located in Sweden, but as I said, I haven't configured locale settings nor any other settings or pools yet. The machine has not even been connected to the internet, just a local network. It's as vanilla a TrueNAS as it can be. I am open to video calls as well if someone from iXsystems would prefer that. Just hoping someone will get in contact. Cheers.
 