How can the dynamic-column problem be solved when exporting large HBase datasets? #418

Open
89333367 opened this issue Mar 8, 2024 · 0 comments
89333367 commented Mar 8, 2024

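Description
I am exporting HBase data on the order of millions of rows, and the export must split across sheets automatically. The difficulty is that different HBase rows have different columns: HBase is a column-oriented store, so each row may carry its own set of columns. For example, the first row has three columns (A, B, C) while the second has only two (A, F). As a result, I ran into a problem with titles when exporting.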
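To make the A/B/C versus A/F example concrete, here is a minimal, library-independent sketch of how a header could be derived as the union of column names in first-seen order. The class name `HeaderUnion` and the method `unionOfColumns` are hypothetical names introduced only for illustration; they are not part of HBase or of the export library used in this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of any library mentioned in this issue.
public class HeaderUnion {

    // Collects every column name that appears in any row, keeping first-seen order.
    static List<String> unionOfColumns(List<Map<String, Object>> rows) {
        Set<String> seen = new LinkedHashSet<>();
        for (Map<String, Object> row : rows) {
            seen.addAll(row.keySet());
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // First row has columns A, B, C.
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("A", 1);
        first.put("B", 2);
        first.put("C", 3);

        // Second row has columns A, F.
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("A", 4);
        second.put("F", 5);

        System.out.println(unionOfColumns(Arrays.asList(first, second))); // prints [A, B, C, F]
    }
}
```

A `LinkedHashSet` is used so that duplicates are dropped while the order in which columns first appear is preserved.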

Reproduction example

List<String> bt = Arrays.asList("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K");


    List<Map> getRows(int page, int size) { // simulate HBase data whose columns are not fixed
        List<Map> rows = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            Map<String, Object> m = new HashMap<>();
            long k = 0;
            for (int j = 0; j < page; j++) {
                int t = j;
                if (t > 10) {
                    t = 10;
                }
                String key = bt.get(t);
                String value = page + "_" + key + "_" + (i + 1) + "_" + (++k);
                m.put(key, value);
            }
            rows.add(m);
        }
        return rows;
    }


    @Test
    void t001() throws IOException {
        List<String> titles = new ArrayList<>();

        DefaultStreamExcelBuilder<Map> streamExcelBuilder = DefaultStreamExcelBuilder.of(Map.class);
        streamExcelBuilder.noStyle();
        streamExcelBuilder.capacity(10000);
        streamExcelBuilder.titles(titles);
        streamExcelBuilder.start();

        for (int i = 0; i < 10; i++) {
            List<Map> rows = getRows(i, 10);
            for (Map row : rows) {
                for (Object key : row.keySet()) {
                    if (!titles.contains(key.toString())) {
                        titles.add(key.toString()); // extend the header with columns found in each returned row
                    }
                }
            }
            streamExcelBuilder.append(rows);
        }

        Workbook workbook = streamExcelBuilder.build();
        FileExportUtil.export(workbook, new File("d:/tmp/1.xlsx"));
        streamExcelBuilder.close();
    }

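Expected result
The export should succeed with a correct header; the header can be the de-duplicated union of the columns across all rows.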
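One way to picture the expected output is that every sparse row gets projected onto the final header, with blanks where a column is absent. The sketch below shows only that projection step in plain Java. `RowAligner` and `align` are hypothetical names, the empty-string placeholder is an assumption, and nothing here claims to be how the export library behaves. It also assumes the complete header is known before rows are written, which for streaming data would mean scanning the column names first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any library mentioned in this issue.
public class RowAligner {

    // Projects one sparse row onto the header order; absent columns become "".
    static List<Object> align(List<String> header, Map<String, Object> row) {
        List<Object> cells = new ArrayList<>(header.size());
        for (String column : header) {
            Object value = row.get(column);
            cells.add(value == null ? "" : value);
        }
        return cells;
    }

    public static void main(String[] args) {
        // Header assumed to be the de-duplicated union of all columns.
        List<String> header = Arrays.asList("A", "B", "C", "F");

        // A row that only has columns A and F.
        Map<String, Object> row = new HashMap<>();
        row.put("A", "a2");
        row.put("F", "f2");

        System.out.println(align(header, row)); // prints [a2, , , f2]
    }
}
```

With rows aligned this way, every exported line has the same width as the header, regardless of which columns the original row contained.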
