vmMgr.c VM_RAM not mapped if VRAM_USE_FTL==0 #88
Comments
Sorry, I can't read your post. Is the following text what you originally intended?
The code in vmMgr_init() should be
mapList_AddPartitionMap(MAP_PART_RAWFLASH, PERM_R, VM_ROM_BASE, FLASH_SYSTEM_BLOCK * 64, VM_ROM_SIZE);
mapList_AddPartitionMap(MAP_PART_FTL, PERM_R | PERM_W, VM_RAM_BASE, 0, VM_RAM_SIZE);
#if (VMRAM_USE_FTL == 0)
uint32_t paddr = (uint32_t)vm_ram_none_ftl;
for (uint32_t vaddr = VM_RAM_BASE; vaddr < (VM_RAM_BASE + VM_RAM_SIZE_NONE_FTL); vaddr += PAGE_SIZE) {
    mmu_map_page(vaddr, paddr, AP_SYSRW_USRRW, VM_CACHE_ENABLE, VM_BUFFER_ENABLE);
    paddr += PAGE_SIZE;
}
mmu_invalidate_tlb();
#endif
So that we can use two kinds of RAM at the same time: not swapped and swapped. I think that the 1 bit per pixel screen buffer (4K) should never be swapped, leaving 264K for cache.
SystemConfig.h would look like
#define USE_TINY_PAGE (1)
#define VMRAM_USE_FTL (0)
#define SEG_SIZE 1048576
#if VMRAM_USE_FTL
#if USE_TINY_PAGE
#ifdef ENABLE_AUIDIOOUT
#define NUM_CACHEPAGE ( 200 ) // 200 * 1 = 200 KB
#else
#define NUM_CACHEPAGE ( 268 ) // 268 * 1 = 268 KB
#endif
#else
#define NUM_CACHEPAGE ( 79 ) // 79 * 4 = 316 KB
#endif
#else
#if USE_TINY_PAGE
#define NUM_CACHEPAGE ( 264 )
#define VM_RAM_SIZE_NONE_FTL ( 4 * 1024 )
#else
#define NUM_CACHEPAGE ( 32 )
#define VM_RAM_SIZE_NONE_FTL ( 168 * 1024 )
#endif
#endif
The loader script Scripts/sys_ld.script would be:
MEMORY
{
vmRAM (rwx) : ORIGIN = 0x02040000, LENGTH = 2M
vmROM (rx ) : ORIGIN = 0x00100000, LENGTH = 6M
}
Changes in kcasporing_gl.c:
char * screen_1bpp=0x02000000;
and avoid initialization of
Now |
Yes, I tried but could not make GitHub render the code correctly, I don't know why (I have added the vmMgr.c file to my giac39.tgz archive). Maybe we could discuss further optimizations on a phpBB forum like https://tiplanet.org/forum/viewforum.php?f=70 ? I think we could spare a few K in OSLoader. For example in msc_disk.c, the variables |
Found another unused buffer: pcWriteBuffer in OSLoader/start.c, 5K. |
Can we share a common buffer for page_save_wr_buf, page_save_rd_buf (VmMgr/vmMgr.c) and data_page_buffer (LowLevelAPI/llapi.c)? Potential saving 4K. |
Potential alignment optimizations?
|
Is L1PTE_NUM really #define L1PTE_NUM (2049)? Setting to 2048 would save almost 4K between L1PTE and L2PTE. |
The peripheral registers are at address 0x80000000, so we need to set "PTE_LOC[0x800]" to map this segment into the virtual address space for the drivers; that is why we defined PTE_LOC[2049]. In fact, most of the area (PTE_LOC[49] to PTE_LOC[2048], about 8KB in total) is redundant and could probably be used. |
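The index arithmetic behind PTE_LOC[0x800] can be sketched as follows. This is only an illustration, assuming a first-level table in which each entry covers one 1 MB segment (matching SEG_SIZE 1048576) and occupies 4 bytes; the helper names are hypothetical and not part of ExistOS.

```c
#include <assert.h>
#include <stdint.h>

#define SEG_SIZE      1048576u  /* 1 MB covered by each first-level entry */
#define L1_ENTRY_SIZE 4u        /* assumption: 32-bit first-level descriptors */

/* Hypothetical helper: index of the first-level entry covering vaddr. */
static uint32_t l1_index(uint32_t vaddr)
{
    return vaddr / SEG_SIZE;
}

/* Hypothetical helper: bytes needed for a table whose highest used index is last_index. */
static uint32_t l1_table_bytes(uint32_t last_index)
{
    return (last_index + 1u) * L1_ENTRY_SIZE;
}

/* Self-check of the numbers quoted in the discussion. */
static void l1_self_check(void)
{
    assert(l1_index(0x80000000u) == 0x800u);  /* peripheral segment needs index 2048 */
    assert(l1_table_bytes(0x800u) == 8196u);  /* hence 2049 entries */
    /* Entries 49..2047 lie below the peripheral entry: just under 8 KB. */
    assert((2048u - 49u) * L1_ENTRY_SIZE == 7996u);
}
```

Under these assumptions the table only needs 2049 entries because of the single peripheral segment at index 0x800; everything between the RAM/ROM entries and that index is unused space.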
MSCWRBuf and MSCWRBuf: I forgot to remove them. They were initially for the small-sector USB transfers, but I don't need them anymore. |
If you look at the loader output, the addresses of L1PTE and L2PTE differ by 12K because L2PTE is 4K aligned. In other words, the 2049th entry is responsible for 4K of additional RAM use. If it's unavoidable, maybe it's possible to use the remaining 4K-4 bytes for something else. |
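The 12K gap can be reproduced with a little alignment arithmetic. The sketch below assumes 4-byte table entries, that L1PTE itself starts on a 4K boundary, and that L2PTE is placed at the next 4K boundary after it; align_up and l2pte_offset are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

#define L1_ENTRY_SIZE 4u     /* assumption: 4 bytes per first-level entry */
#define L2PTE_ALIGN   4096u  /* L2PTE is 4K aligned, per the loader output */

/* Round n up to the next multiple of align (align must be a power of two). */
static uint32_t align_up(uint32_t n, uint32_t align)
{
    return (n + align - 1u) & ~(align - 1u);
}

/* Offset of L2PTE from the start of L1PTE when L1PTE has num_entries entries. */
static uint32_t l2pte_offset(uint32_t num_entries)
{
    return align_up(num_entries * L1_ENTRY_SIZE, L2PTE_ALIGN);
}

/* Self-check: 2049 entries cost 12K, 2048 entries would cost 8K. */
static void l2pte_self_check(void)
{
    assert(l2pte_offset(2049u) == 12u * 1024u);
    assert(l2pte_offset(2048u) == 8u * 1024u);
}
```

With 2049 entries the table is 8196 bytes, so the padding up to the next 4K boundary is 4092 bytes, which is the "4K-4 bytes" mentioned above.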
I have found a way to move PageFaultQueue and mapList from bss to data and save 1K: just initialize them to {0} or 0. Then move the definition of L2PTE to the beginning of vmMgr.c, renamed 0vmMgr.c, and the loader orders the RAM much better.
I increased the non-swappable usable RAM by 4K to 8K and the cache by 4 pages to 268 pages, and the ROM heap start address is lower than before. |
I moved the L1 page table to a separate space supported by the chip (default first-level page table, DFLPT), which saves 8KB of memory, and I trimmed some useless buffers, so we now have about 300KB of physical memory. Here is the new code: I have tested turning memory swapping off, and it looks like the UI written with LVGL is difficult to run (maybe we need a simpler UI), but it is sufficient for KhiCAS to run, so I set it up to enter KhiCAS immediately after startup (https://github.com/Repeerc/ExistOS-For-HP39GII/blob/main/System/main.c#L1160-L1178). |
Great! |
Unfortunately, my attempts to boot the calculator with this new configuration failed. One of the changes I made is F3 detection at boot time in OSLoader/start.c: if F3 is pressed, display No system. This way one can reflash a calculator even if System ends up in a System panic. I also had problems with USB MSC mode, which did not work until I exchanged the Views and mode string displays in the source code (start.c); then it resumed working. No idea why, perhaps a problem with my calculator... I'm now confident that the RAM swap is minimal inside KhiCAS, so the lifetime of the flash should not be affected by swapping. I will now stop looking at the OS and concentrate on KhiCAS itself. |
The code in vmMgr_init() should be
mapList_AddPartitionMap(MAP_PART_RAWFLASH, PERM_R, VM_ROM_BASE, FLASH_SYSTEM_BLOCK * 64, VM_ROM_SIZE);
#if (VMRAM_USE_FTL == 0)
#endif
So that we can use two kinds of RAM at the same time: not swapped and swapped. I think that the 1 bit per pixel screen buffer (4K) should never be swapped, leaving 264K for cache.
SystemConfig.h would look like
#define USE_TINY_PAGE (1)
#define VMRAM_USE_FTL (0)
#define SEG_SIZE 1048576
#if VMRAM_USE_FTL
#if USE_TINY_PAGE
#ifdef ENABLE_AUIDIOOUT
#define NUM_CACHEPAGE ( 200 ) // 200 * 1 = 200 KB
#else
#define NUM_CACHEPAGE ( 268 ) // 268 * 1 = 268 KB
#endif
#else
#define NUM_CACHEPAGE ( 79 ) // 79 * 4 = 316 KB
#endif
#else
#if USE_TINY_PAGE
#define NUM_CACHEPAGE ( 264 )
#define VM_RAM_SIZE_NONE_FTL ( 4 * 1024 )
#else
#define NUM_CACHEPAGE ( 32 )
#define VM_RAM_SIZE_NONE_FTL ( 168 * 1024 )
#endif
#endif
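As a rough check of the numbers in this configuration, here is a small sketch. It assumes that tiny pages are 1 KB and that the screen is 256 x 128 pixels (an assumption about the display, not stated in this thread); the helper names are hypothetical. Under those assumptions the 1 bit per pixel buffer takes exactly 4 KB, that is four cache pages, which is the difference between 268 and 264.

```c
#include <assert.h>
#include <stdint.h>

#define TINY_PAGE_SIZE       1024u          /* assumption: 1 KB tiny pages */
#define SCREEN_W             256u           /* assumption: display width in pixels */
#define SCREEN_H             128u           /* assumption: display height in pixels */
#define NUM_CACHEPAGE_FTL    268u           /* cache pages when all VM RAM is swapped */
#define VM_RAM_SIZE_NONE_FTL (4u * 1024u)   /* non-swapped RAM, as in the config above */

/* Hypothetical helper: size in bytes of a 1 bit per pixel frame buffer. */
static uint32_t screen_1bpp_bytes(void)
{
    return SCREEN_W * SCREEN_H / 8u;
}

/* Hypothetical helper: cache pages left once the non-swapped RAM is set aside. */
static uint32_t cache_pages_left(void)
{
    return NUM_CACHEPAGE_FTL - VM_RAM_SIZE_NONE_FTL / TINY_PAGE_SIZE;
}

/* Self-check: the screen buffer fits the non-swapped area, leaving 264 pages. */
static void budget_self_check(void)
{
    assert(screen_1bpp_bytes() <= VM_RAM_SIZE_NONE_FTL);
    assert(cache_pages_left() == 264u);
}
```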
The loader script Scripts/sys_ld.script would be:
MEMORY
{
vmRAM (rwx) : ORIGIN = 0x02040000, LENGTH = 2M
vmROM (rx ) : ORIGIN = 0x00100000, LENGTH = 6M
}
Changes in kcasporing_gl.c:
char * screen_1bpp=0x02000000;
and avoid initialization of virtual_screen in 1 bit per pixel mode:
if (!khicas_1bpp) memset(virtual_screen, COLOR_WHITE, VIR_LCD_PIX_H * VIR_LCD_PIX_W);
Now integrate(1/(x^4+1)) takes 0.93s (normal mode) or 0.54s (fast CPU). By comparison, on the monochrome Casio it takes 0.34s. For plot(sin(x)) in fast CPU mode, it's 0.06s vs 0.15s on the Casio. If we can free up some of the RAM that OSLoader uses for itself, it should be possible to improve the integrate benchmark!